SHEPPARD’S CORRECTIONS FOR A DISCRETE VARIABLE
CECIL C. CRAIG


In the Annals of Mathematical Statistics,¹ J. R. Abernethy gave a derivation
of the corrections to eliminate the systematic errors in the moments of a discrete 
variable due to grouping. It is the purpose of this note to considerably shorten 
and simplify the derivation of these corrections by an adoption of a device used 
by R. A. Fisher (not published so far as I know) in the case of the ordinary 
Sheppard’s corrections. 

Let us suppose that m consecutive values of the discrete variable in question 
are grouped in a frequency class of width k. The m smaller intervals of width 
k/m go to make up the class width k, the actual points representing the m 
values of the variable being plotted at the centers of the sub-intervals. Now 
let us suppose that each of m consecutive boundary points of the sub-intervals is 
as likely to be chosen as a boundary point of the larger intervals as any other. 
Then, if $x_i$ is the class mark of the $i$-th frequency class, for any true value, $x$, of
the discrete variable included in this frequency class, we have

$$x_i = x + \epsilon$$

in which $x$ and $\epsilon$ are independent variables and $\epsilon$ takes on the $m$ values

$$-\frac{m-1}{2}\,\frac{k}{m},\quad -\frac{m-3}{2}\,\frac{k}{m},\quad \cdots,\quad \frac{m-3}{2}\,\frac{k}{m},\quad \frac{m-1}{2}\,\frac{k}{m},$$

with the equal relative frequencies $1/m$.

The moments of $x_i$ are those calculated from the grouped frequency distribution;
the problem is to express the average values of the moments of $x$ in
terms of the calculated moments and $k$ and $m$. The use of moment generating
functions at once leads to the desired results. Denoting the $s$-th moment of $x_i$
about any origin by $\nu_s$, the like moment of $x$ by $\mu_s$, and the moment
generating functions of $x_i$, $x$ and $\epsilon$ by $M_{x_i}(\vartheta)$, $M_x(\vartheta)$ and $M_\epsilon(\vartheta)$ respectively,
we have at once

$$(1)\qquad M_{x_i}(\vartheta) = M_x(\vartheta)\,M_\epsilon(\vartheta)$$


¹ “On the Elimination of Systematic Errors Due to Grouping,” vol. IV (1933), pp. 263-277.
in which by definition

$$M_{x_i}(\vartheta) = 1 + \nu_1\vartheta + \nu_2\vartheta^2/2! + \nu_3\vartheta^3/3! + \cdots,$$
$$M_x(\vartheta) = 1 + \mu_1\vartheta + \mu_2\vartheta^2/2! + \mu_3\vartheta^3/3! + \cdots.$$


The computation necessary to get the actual corrections consists in the calculation
of the coefficients in the formal expansion of

$$(2)\qquad M_\epsilon(\vartheta) = \frac{1}{m}\sum_{\epsilon = -\frac{m-1}{2}\frac{k}{m}}^{\frac{m-1}{2}\frac{k}{m}} e^{\epsilon\vartheta}$$

in powers of $\vartheta$ and then solving for the $\mu_s$’s in (1).

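But the summation indicated in (2) is readily effected by means of the calculus
of finite differences. In fact, we get

$$(3)\qquad M_\epsilon(\vartheta) = \frac{e^{\frac{m+1}{2}k\vartheta/m} - e^{-\frac{m-1}{2}k\vartheta/m}}{m\left(e^{k\vartheta/m} - 1\right)} = \frac{\sinh k\vartheta/2}{m\,\sinh k\vartheta/2m}.$$

Then (1) becomes

$$(4)\qquad M_{x_i}(\vartheta) = M_x(\vartheta)\,\frac{\sinh k\vartheta/2}{m\,\sinh k\vartheta/2m}.$$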
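The closed form in (3) can be checked numerically against the direct sum in (2). The following is a minimal sketch, not part of the original derivation; the function names are illustrative, and `theta` must be non-zero because the closed form is a ratio of hyperbolic sines.

```python
import math

def mgf_epsilon_direct(theta, k, m):
    """Average of exp(eps * theta) over the m equally likely offsets eps, as in (2)."""
    offsets = [(-(m - 1) / 2 + j) * k / m for j in range(m)]
    return sum(math.exp(e * theta) for e in offsets) / m

def mgf_epsilon_closed(theta, k, m):
    """Closed form sinh(k*theta/2) / (m * sinh(k*theta/(2m))), as in (3)."""
    return math.sinh(k * theta / 2) / (m * math.sinh(k * theta / (2 * m)))

if __name__ == "__main__":
    # Hypothetical values: class width k = 5 holding m = 4 discrete values.
    for theta in (0.1, 0.3, 1.0):
        print(theta, mgf_epsilon_direct(theta, 5, 4), mgf_epsilon_closed(theta, 5, 4))
```

The two columns printed by the usage lines should agree to rounding error for any admissible `k`, `m` and non-zero `theta`.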
If we let $m \to \infty$ we get the corresponding result for a continuous variable

$$(5)\qquad M_{x_i}(\vartheta) = M_x(\vartheta)\,\frac{\sinh k\vartheta/2}{k\vartheta/2},$$

already given by Langdon and Ore,² though in a less elegant manner; for in this
case, the expression analogous to (1) is immediately seen to be

$$M_{x_i}(\vartheta) = M_x(\vartheta)\int_{-k/2}^{k/2} e^{\epsilon\vartheta}\,d\epsilon/k.$$


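Returning to (4), taking the logarithms of both sides, and remembering that the
logarithm of the moment generating function is the generating function of the
semi-invariants of Thiele, we get

$$(6)\qquad \bar\lambda_1\vartheta + \bar\lambda_2\vartheta^2/2! + \bar\lambda_3\vartheta^3/3! + \cdots = \lambda_1\vartheta + \lambda_2\vartheta^2/2! + \lambda_3\vartheta^3/3! + \cdots + \log\left[\frac{k\vartheta/2m}{k\vartheta/2}\cdot\frac{\sinh k\vartheta/2}{\sinh k\vartheta/2m}\right],$$

in which the $\bar\lambda_s$’s are the calculated semi-invariants and the $\lambda_s$’s the corrected
ones.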


² W. H. Langdon and O. Ore, Semi-invariants and Sheppard’s Corrections, Annals of
Mathematics, vol. 31 (1930), pp. 230-232.
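But since

$$(7)\qquad \log\frac{\sinh x}{x} = \sum_{s=1}^{\infty}\frac{(-1)^{s+1} B_s\,(2x)^{2s}}{2s\,(2s)!},$$

we have on setting

$$-\log\left[\frac{k\vartheta/2m}{k\vartheta/2}\cdot\frac{\sinh k\vartheta/2}{\sinh k\vartheta/2m}\right] = a_0 + a_1\vartheta + a_2\vartheta^2/2! + a_3\vartheta^3/3! + \cdots:$$

$$(8)\qquad a_0 = 0,\quad a_{2s+1} = 0,\quad s = 0, 1, 2, \cdots;\qquad a_{2s} = \frac{(-1)^s B_s k^{2s}}{2s}\left(1 - \frac{1}{m^{2s}}\right).$$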


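Obviously these $a$’s are the “Sheppard’s” corrections for the semi-invariants.
We have generally

$$\lambda_{2s+1} = \bar\lambda_{2s+1}, \qquad s = 0, 1, 2, \cdots$$

$$\lambda_{2s} = \bar\lambda_{2s} + \frac{(-1)^s B_s k^{2s}}{2s}\left(1 - \frac{1}{m^{2s}}\right).$$

In particular

$$\lambda_2 = \bar\lambda_2 - \left(1 - \frac{1}{m^2}\right)k^2/12, \qquad \lambda_4 = \bar\lambda_4 + \left(1 - \frac{1}{m^4}\right)k^4/120,$$
$$\lambda_6 = \bar\lambda_6 - \left(1 - \frac{1}{m^6}\right)k^6/252, \qquad \lambda_8 = \bar\lambda_8 + \left(1 - \frac{1}{m^8}\right)k^8/240.$$

For $m \to \infty$, these give of course the results reached by Langdon and Ore.³
To get the corrections for the moments let us set

$$e^{a_2\vartheta^2/2! + a_4\vartheta^4/4! + \cdots} = \alpha_0 + \alpha_1\vartheta + \alpha_2\vartheta^2/2! + \alpha_3\vartheta^3/3! + \cdots.$$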


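From (7) and (8)

$$(9)\qquad \alpha_0 = 1,\quad \alpha_{2n+1} = 0,\quad n = 0, 1, 2, \cdots;\qquad \alpha_{2n} = \sum\frac{(2n)!\;a_2^{\,r}\,a_4^{\,s}\,a_6^{\,t}\cdots}{(2!)^r\,(4!)^s\,(6!)^t\cdots\;r!\,s!\,t!\cdots},$$

the summation extending over all non-negative integral values of $r, s, t, \cdots$ for which

$$r + 2s + 3t + \cdots = n.$$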
Then finally we have the formula,

$$(10)\qquad \mu_n = \sum_{s=0}^{n}\binom{n}{s}\,\alpha_s\,\nu_{n-s},$$

for the corrected moments.
Writing out the first four $\alpha$’s, we have for the first eight moments about the
mean

$$\begin{aligned}
\mu_1 &= \nu_1 = 0\\
\mu_2 &= \nu_2 - (1 - 1/m^2)\,k^2/12\\
\mu_3 &= \nu_3\\
\mu_4 &= \nu_4 - (1 - 1/m^2)\,\nu_2 k^2/2 + (1 - 1/m^2)(7 - 3/m^2)\,k^4/240\\
\mu_5 &= \nu_5 - 5(1 - 1/m^2)\,\nu_3 k^2/6\\
\mu_6 &= \nu_6 - 5(1 - 1/m^2)\,\nu_4 k^2/4 + (1 - 1/m^2)(7 - 3/m^2)\,\nu_2 k^4/16\\
&\quad - (1 - 1/m^2)(31 - 18/m^2 + 3/m^4)\,k^6/1344\\
\mu_7 &= \nu_7 - 7(1 - 1/m^2)\,\nu_5 k^2/4 + 7(1 - 1/m^2)(7 - 3/m^2)\,\nu_3 k^4/48\\
\mu_8 &= \nu_8 - 7(1 - 1/m^2)\,\nu_6 k^2/3 + 7(1 - 1/m^2)(7 - 3/m^2)\,\nu_4 k^4/24\\
&\quad - (1 - 1/m^2)(31 - 18/m^2 + 3/m^4)\,\nu_2 k^6/48\\
&\quad + (1 - 1/m^2)(381 - 239/m^2 + 55/m^4 - 5/m^6)\,k^8/11520.
\end{aligned}$$

The final term in $\mu_{2n}$ as given above is $\alpha_{2n}$.
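The corrections above can be reproduced mechanically from (8), (9) and (10). The sketch below is illustrative only: the function names are hypothetical, the Bernoulli numbers are assumed to follow the older convention B_1 = 1/6, B_2 = 1/30, B_3 = 1/42, B_4 = 1/30 (the convention under which (8) yields the k²/12 term), and the α's are obtained from the standard recurrence for the exponential of a power series rather than from the partition sum in (9). It is limited to orders up to 8 by the small Bernoulli table.

```python
from fractions import Fraction
from math import comb

# Bernoulli numbers in the older convention assumed here: B_1 = 1/6, B_2 = 1/30, ...
OLD_BERNOULLI = {1: Fraction(1, 6), 2: Fraction(1, 30), 3: Fraction(1, 42), 4: Fraction(1, 30)}

def a_coeff(two_s, k, m):
    """a_{2s} = (-1)^s B_s k^{2s} / (2s) * (1 - 1/m^{2s}), as in (8)."""
    s = two_s // 2
    k, m = Fraction(k), Fraction(m)
    return (-1) ** s * OLD_BERNOULLI[s] * k ** two_s / two_s * (1 - 1 / m ** two_s)

def alpha_coeffs(k, m, max_order=8):
    """alpha_0 .. alpha_max_order, where exp(sum a_j t^j / j!) = sum alpha_n t^n / n!.

    Uses alpha_n = sum_{j=1..n} C(n-1, j-1) * a_j * alpha_{n-j}.
    """
    a = [Fraction(0)] * (max_order + 1)
    for two_s in range(2, max_order + 1, 2):
        a[two_s] = a_coeff(two_s, k, m)
    alpha = [Fraction(1)] + [Fraction(0)] * max_order
    for n in range(1, max_order + 1):
        alpha[n] = sum(comb(n - 1, j - 1) * a[j] * alpha[n - j] for j in range(1, n + 1))
    return alpha

def corrected_moment(n, nu, k, m):
    """mu_n = sum_s C(n, s) alpha_s nu_{n-s}, as in (10); nu[0] must equal 1."""
    alpha = alpha_coeffs(k, m, max_order=max(n, 2))
    return sum(comb(n, s) * alpha[s] * nu[n - s] for s in range(n + 1))
```

Exact rational arithmetic is used so that the output can be compared term by term with the printed coefficients, for example `alpha_coeffs(1, 2)[4]` against (1 - 1/m²)(7 - 3/m²)k⁴/240 at k = 1, m = 2.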

The above method is readily extended to the case of two or more variables. 
We will illustrate the procedure by getting the results likely to be required for 
two variables. As before we suppose that m consecutive values of x are grouped 
in a frequency class of width k, and we shall similarly suppose that n values of y 
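are grouped in a frequency class of width $l$. And arguing as before we write now

$$x_i = x + \epsilon$$
$$y_j = y + \eta$$

in which $\epsilon$ and $\eta$ are independent of $x$ and $y$ and of each other.
The moment generating function of two variables is defined by the identity
in $\vartheta$ and $\omega$:

$$\begin{aligned}
M_{x,y}(\vartheta,\omega) &= 1 + (\mu'_{10}\vartheta + \mu'_{01}\omega) + \frac{1}{2!}(\mu'_{20}\vartheta^2 + 2\mu'_{11}\vartheta\omega + \mu'_{02}\omega^2) + \cdots\\
&= 1 + (\mu'_{10}\vartheta + \mu'_{01}\omega) + \frac{1}{2!}(\mu'_{10}\vartheta + \mu'_{01}\omega)^{(2)} + \frac{1}{3!}(\mu'_{10}\vartheta + \mu'_{01}\omega)^{(3)} + \cdots
\end{aligned}$$

in which the manner of expansion of $(\mu'_{10}\vartheta + \mu'_{01}\omega)^{(s)}$ is evident.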







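Then from the properties of moment generating functions, we have

$$(11)\qquad \begin{aligned}
M_{x_i,y_j}(\vartheta,\omega) &= M_{x,y}(\vartheta,\omega)\sum_{\epsilon = -\frac{m-1}{2}\frac{k}{m}}^{\frac{m-1}{2}\frac{k}{m}}\;\sum_{\eta = -\frac{n-1}{2}\frac{l}{n}}^{\frac{n-1}{2}\frac{l}{n}}\frac{e^{\epsilon\vartheta + \eta\omega}}{mn}\\
&= M_{x,y}(\vartheta,\omega)\,\frac{\sinh k\vartheta/2}{m\,\sinh k\vartheta/2m}\cdot\frac{\sinh l\omega/2}{n\,\sinh l\omega/2n}.
\end{aligned}$$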


As in the case of a single variable it will be simpler first to get the corrections 
for the semi-invariants. The logarithm of the moment generating function 
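is the generating function of the semi-invariants; thus

$$\log M_{x,y}(\vartheta,\omega) = (\lambda_{10}\vartheta + \lambda_{01}\omega) + \frac{1}{2!}(\lambda_{10}\vartheta + \lambda_{01}\omega)^{(2)} + \frac{1}{3!}(\lambda_{10}\vartheta + \lambda_{01}\omega)^{(3)} + \cdots$$

in which

$$(\lambda_{10}\vartheta + \lambda_{01}\omega)^{(3)} = \lambda_{30}\vartheta^3 + 3\lambda_{21}\vartheta^2\omega + 3\lambda_{12}\vartheta\omega^2 + \lambda_{03}\omega^3,$$

etc.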
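We write (see (7)),

$$(12)\qquad \begin{aligned}
\log\frac{m\,\sinh k\vartheta/2m}{\sinh k\vartheta/2} &= a_2\vartheta^2/2! + a_4\vartheta^4/4! + \cdots,\\
\log\frac{n\,\sinh l\omega/2n}{\sinh l\omega/2} &= b_2\omega^2/2! + b_4\omega^4/4! + \cdots,
\end{aligned}$$

with

$$a_{2s} = \frac{(-1)^s B_s k^{2s}}{2s}\left(1 - 1/m^{2s}\right), \qquad b_{2s} = \frac{(-1)^s B_s l^{2s}}{2s}\left(1 - 1/n^{2s}\right).$$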


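Then from (11) we have

$$(\lambda_{10}\vartheta + \lambda_{01}\omega)^{(2s+1)} = (\bar\lambda_{10}\vartheta + \bar\lambda_{01}\omega)^{(2s+1)}, \qquad s = 0, 1, 2, \cdots$$
$$(\lambda_{10}\vartheta + \lambda_{01}\omega)^{(2s)} = (\bar\lambda_{10}\vartheta + \bar\lambda_{01}\omega)^{(2s)} + a_{2s}\vartheta^{2s} + b_{2s}\omega^{2s}, \qquad s = 1, 2, 3, \cdots$$

in which, of course, $\bar\lambda_{rs}$ is a calculated semi-invariant and $\lambda_{rs}$ a corrected one.
We read off

$$\lambda_{rs} = \bar\lambda_{rs}, \qquad rs \neq 0,$$

as already shown by Wold in the case of continuous variables,⁴

$$\lambda_{2s+1,0} = \bar\lambda_{2s+1,0}, \qquad \lambda_{0,2s+1} = \bar\lambda_{0,2s+1}.$$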


⁴ Herman Wold: Sheppard’s Correction Formulae in Several Variables: Skandinavisk
Aktuarietidskrift, vol. XVII (1934), pp. 248-255.
The values of $\lambda_{2s,0}$ are the same as those for $\lambda_{2s}$ given above and those for $\lambda_{0,2s}$
are obtained from these merely by replacing in them $m$ and $k$ by $n$ and $l$. And
it is quite obvious that for any number of variables the only semi-invariants to 
be corrected are those in which a single figure of the index is different from 
zero and is moreover even. For such semi-invariants the corrections are 
naturally those derived for a single variable. 

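Now to derive the corrections for the moments, we write

$$(13)\qquad \begin{aligned}
\frac{m\,\sinh k\vartheta/2m}{\sinh k\vartheta/2}\cdot\frac{n\,\sinh l\omega/2n}{\sinh l\omega/2} &= e^{\frac{1}{2!}(a_2\vartheta^2 + b_2\omega^2) + \frac{1}{4!}(a_4\vartheta^4 + b_4\omega^4) + \cdots}\\
&= 1 + \frac{1}{2!}(\alpha_{20}\vartheta^2 + \alpha_{02}\omega^2) + \frac{1}{4!}(\alpha_{20}\vartheta^2 + \alpha_{02}\omega^2)^{(2)} + \cdots,
\end{aligned}$$

with now,

$$(\alpha_{20} + \alpha_{02})^{(h)} = \sum\frac{(2h)!\,(a_2 + b_2)^r\,(a_4 + b_4)^s\cdots}{(2!)^r\,(4!)^s\cdots\;r!\,s!\cdots},$$

the summation to be over all non-negative integral values of $r, s, \cdots$ for which

$$r + 2s + \cdots = h$$

and in which the parameters $\vartheta$ and $\omega$ may be omitted without ambiguity.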
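The formula for the corrected moments can now be written

$$(14)\qquad (\mu'_{10} + \mu'_{01})^{(p)} = \sum_{q=0}^{[p/2]}\binom{p}{2q}\,(\alpha_{20} + \alpha_{02})^{(q)}\,(\nu'_{10} + \nu'_{01})^{(p-2q)}.$$

This gives

$$(15)\qquad \begin{aligned}
\mu'_{10} + \mu'_{01} &= \nu'_{10} + \nu'_{01}\\
(\mu'_{10} + \mu'_{01})^{(2)} &= (\nu'_{10} + \nu'_{01})^{(2)} + (\alpha_{20} + \alpha_{02})\\
(\mu'_{10} + \mu'_{01})^{(3)} &= (\nu'_{10} + \nu'_{01})^{(3)} + 3(\nu'_{10} + \nu'_{01})(\alpha_{20} + \alpha_{02})\\
(\mu'_{10} + \mu'_{01})^{(4)} &= (\nu'_{10} + \nu'_{01})^{(4)} + 6(\nu'_{10} + \nu'_{01})^{(2)}(\alpha_{20} + \alpha_{02}) + (\alpha_{20} + \alpha_{02})^{(2)}.
\end{aligned}$$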


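Noting that,

$$(\alpha_{20} + \alpha_{02})^{(2)} = a_4 + b_4 + 3(a_2 + b_2)^2,$$

we get the following formulas for the correction of the product moments about
an arbitrary origin:

$$\begin{aligned}
\mu'_{11} &= \nu'_{11}\\
\mu'_{21} &= \nu'_{21} - (1 - 1/m^2)\,\nu'_{01}k^2/12\\
\mu'_{12} &= \nu'_{12} - (1 - 1/n^2)\,\nu'_{10}l^2/12\\
\mu'_{31} &= \nu'_{31} - (1 - 1/m^2)\,\nu'_{11}k^2/4\\
\mu'_{22} &= \nu'_{22} - (1 - 1/m^2)\,\nu'_{02}k^2/12 - (1 - 1/n^2)\,\nu'_{20}l^2/12\\
&\quad + (1 - 1/m^2)(1 - 1/n^2)\,k^2l^2/144\\
\mu'_{13} &= \nu'_{13} - (1 - 1/n^2)\,\nu'_{11}l^2/4.
\end{aligned}$$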
The above results give the corrections for moments about the mean, merely
by dropping the primes and setting $\nu_{10} = \nu_{01} = 0$. In practice the corrections
needed are for moments about the mean, and though there would be no difficulty
in computing additional results for an arbitrary origin, I shall give here only the
additional results for moments about the mean through the sixth order, omitting
those obtained merely by permutation of subscripts and interchange of $k$ and $m$
with $l$ and $n$ respectively.

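First, the necessary extension of (15) is

$$(15)\qquad \begin{aligned}
(\mu_{10} + \mu_{01})^{(5)} &= (\nu_{10} + \nu_{01})^{(5)} + 10(\nu_{10} + \nu_{01})^{(3)}(\alpha_{20} + \alpha_{02})\\
(\mu_{10} + \mu_{01})^{(6)} &= (\nu_{10} + \nu_{01})^{(6)} + 15(\nu_{10} + \nu_{01})^{(4)}(\alpha_{20} + \alpha_{02})\\
&\quad + 15(\nu_{10} + \nu_{01})^{(2)}(\alpha_{20} + \alpha_{02})^{(2)} + (\alpha_{20} + \alpha_{02})^{(3)}.
\end{aligned}$$

We need the additional relation:

$$(\alpha_{20} + \alpha_{02})^{(3)} = a_6 + b_6 + 15(a_4 + b_4)(a_2 + b_2) + 15(a_2 + b_2)^3.$$

The additional formulas for product moments about the mean follow:

$$\begin{aligned}
\mu_{41} &= \nu_{41} - (1 - 1/m^2)\,\nu_{21}k^2/2\\
\mu_{32} &= \nu_{32} - (1 - 1/n^2)\,\nu_{30}l^2/12 - (1 - 1/m^2)\,\nu_{12}k^2/4\\
\mu_{51} &= \nu_{51} - (1 - 1/m^2)\,5\nu_{31}k^2/6 + (1 - 1/m^2)(7 - 3/m^2)\,\nu_{11}k^4/48\\
\mu_{42} &= \nu_{42} - (1 - 1/n^2)\,\nu_{40}l^2/12 - (1 - 1/m^2)\,\nu_{22}k^2/2\\
&\quad + (1 - 1/m^2)(1 - 1/n^2)\,\nu_{20}k^2l^2/24\\
&\quad + (1 - 1/m^2)(7 - 3/m^2)\,\nu_{02}k^4/240 - (1 - 1/m^2)(7 - 3/m^2)(1 - 1/n^2)\,k^4l^2/2880\\
\mu_{33} &= \nu_{33} - (1 - 1/m^2)\,\nu_{13}k^2/4 - (1 - 1/n^2)\,\nu_{31}l^2/4\\
&\quad + (1 - 1/m^2)(1 - 1/n^2)\,\nu_{11}k^2l^2/16.
\end{aligned}$$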


For $m$ and $n$ infinite these results give the formulas for two continuous variables
already found by Baten⁵ and Wold.⁶

The reader will note that this development does not impose the “high contact”
condition, except in so far as it assumes the existence of the moments that 
occur in the formulas. And it exhibits in the clearest fashion that Sheppard’s 
corrections are corrections on the average. 


UNIVERSITY OF MICHIGAN. 


⁵ W. D. Baten: Corrections for the Moments of a Frequency Distribution in Two Variables;
Annals of Mathematical Statistics, vol. II (1931), pp. 309-319.
⁶ Loc. cit., p. 253.





FUNDAMENTALS OF THE THEORY OF INVERSE SAMPLING¹
By CHING-LAI SHEN
Part I. Introduction²
SECTION I. STATISTICAL CONCEPTS OF THE THEORY OF SAMPLING


One of the chief objects in statistics is to form a judgment of a very large 
statistical universe, known as a parent population, by means of a study of a part 
or sample thereof, which is drawn at random. To make a complete survey of 
the parent population is sometimes impossible or impractical. For example, 
it is impossible to measure the heights of all adult persons in a country. It is 
impractical to test for infectious bacteria the whole body of water in a city 
reservoir. All that we can do is to obtain an unbiased sample. By an unbiased 
sample, we mean a sample in which each individual has an equal and independent 
chance to be included. From this chosen sample we attempt to draw some con- 
clusion concerning the nature of the whole parent population in accordance with 
certain mathematical principles. 

Now the sample which we choose is of course only one of the samples that can 
be possibly drawn from a given parent population. Suppose there is a popula- 
tion of s individuals from which we wish to choose a sample of r. It is clear 
that there exist ${}_sC_r$ such samples, each of which is equally likely to be chosen.
Therefore these ${}_sC_r$ samples constitute the so-called distribution of samples.
To describe from the statistical point of view the distribution of samples, we 
must find its mean, standard deviation, skewness, excess, and other higher 
characteristics. The first three are usually referred to as elementary statistical 
functions. 

Suppose $x_i$ be the variate (by which we mean the magnitude of a specified
character of an individual to be measured) where $i = 1, 2, 3, \cdots, s$; and $z_j$ be the
samples chosen from the parent population where $j = 1, 2, 3, \cdots, {}_sC_r$. Then
the ${}_sC_r$ samples, each consisting of $r$ variables, will be formed after the following
fashion:

$$\begin{aligned}
z_1 &= x_1 + x_2 + x_3 + \cdots + x_r\\
z_2 &= x_2 + x_3 + x_4 + \cdots\\
&\;\;\vdots\\
z_{{}_sC_r} &= x_{s-r+1} + x_{s-r+2} + x_{s-r+3} + \cdots + x_s
\end{aligned}$$


1 A dissertation submitted in partial fulfillment of the requirement for the degree of doc- 
tor of philosophy in the University of Michigan. 

2 The writer wishes to express his appreciation for the assistance Professor H. C. Carver 
has given him in making this study. 
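If we denote the nth moment of the parent population about its mean by

$$\bar\mu_{n:x} = \frac{\sum_{i=1}^{s}(x_i - M_x)^n}{s}$$

and the nth moment of the distribution of samples about its mean by

$$\bar\mu_{n:z} = \frac{\sum_{j=1}^{\binom{s}{r}}(z_j - M_z)^n}{\binom{s}{r}}$$

and if we then utilize the multinomial theorem, we may be able to express the
sample moments in terms of the moments of the parent population:³

$$(1)\qquad \begin{aligned}
M_z &= rM_x\\
\bar\mu_{2:z} &= 2!\left[P_2\,\frac{s\bar\mu_{2:x}}{2!}\right]\\
\bar\mu_{3:z} &= 3!\left[P_3\,\frac{s\bar\mu_{3:x}}{3!}\right]\\
\bar\mu_{4:z} &= 4!\left[P_4\,\frac{s\bar\mu_{4:x}}{4!} + \frac{P_2^2}{2!}\left(\frac{s\bar\mu_{2:x}}{2!}\right)^2\right]\\
\bar\mu_{5:z} &= 5!\left[P_5\,\frac{s\bar\mu_{5:x}}{5!} + P_3P_2\,\frac{s\bar\mu_{3:x}}{3!}\cdot\frac{s\bar\mu_{2:x}}{2!}\right]\\
\bar\mu_{6:z} &= 6!\left[P_6\,\frac{s\bar\mu_{6:x}}{6!} + P_4P_2\,\frac{s\bar\mu_{4:x}}{4!}\cdot\frac{s\bar\mu_{2:x}}{2!} + \frac{P_3^2}{2!}\left(\frac{s\bar\mu_{3:x}}{3!}\right)^2 + \frac{P_2^3}{3!}\left(\frac{s\bar\mu_{2:x}}{2!}\right)^3\right]
\end{aligned}$$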


where $P_n$ is obtained from the sampling polynomial $P_n(p)$ by writing $p^i$ as $p_i$:

$$(2)\qquad \begin{aligned}
P_1(p) &= p\\
P_2(p) &= p - p^2\\
P_3(p) &= p - 3p^2 + 2p^3\\
P_4(p) &= p - 7p^2 + 12p^3 - 6p^4\\
P_5(p) &= p - 15p^2 + 50p^3 - 60p^4 + 24p^5\\
P_6(p) &= p - 31p^2 + 180p^3 - 390p^4 + 360p^5 - 120p^6, \text{ etc.}
\end{aligned}$$

where

$$p_i = \frac{r(r-1)(r-2)\cdots(r-i+1)}{s(s-1)(s-2)\cdots(s-i+1)}.$$
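The substitution of $p_i$ for $p^i$ is easy to get wrong for products such as $P_2^2$, which must be expanded as polynomials in $p$ before the substitution is made. The following sketch illustrates the rule under that reading; the helper names are hypothetical and exact fractions are used so the values can be compared with tabulated ones for s = 1000, r = 100.

```python
from fractions import Fraction

# Coefficients of the sampling polynomials P_n(p) for p, p^2, p^3, ... (no constant term).
SAMPLING_POLYS = {
    1: [1],
    2: [1, -1],
    3: [1, -3, 2],
    4: [1, -7, 12, -6],
    5: [1, -15, 50, -60, 24],
    6: [1, -31, 180, -390, 360, -120],
}

def p_sub(i, r, s):
    """p_i = r(r-1)...(r-i+1) / (s(s-1)...(s-i+1))."""
    value = Fraction(1)
    for j in range(i):
        value *= Fraction(r - j, s - j)
    return value

def poly_mul(a, b):
    """Multiply two polynomials without constant term; index 0 holds the coefficient of p."""
    out = [0] * (len(a) + len(b))
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j + 1] += ai * bj  # p^(i+1) * p^(j+1) = p^(i+j+2)
    return out

def evaluate(coeffs, r, s):
    """Replace each power p^i by p_i and sum the terms."""
    return sum(c * p_sub(i + 1, r, s) for i, c in enumerate(coeffs))

if __name__ == "__main__":
    P2 = SAMPLING_POLYS[2]
    print(float(evaluate(P2, 100, 1000)))                # P_2
    print(float(evaluate(poly_mul(P2, P2), 100, 1000)))  # P_2^2, expanded before substitution
```

Note that the symbolic $P_2^2$ obtained this way is not the numerical square of $P_2$.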


3 Carver, H. C., Annals of Mathematical Statistics, Vol. 1, No. 1, pp. 106-107. 
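SECTION II. FREQUENCY CURVE OF THE DISTRIBUTION OF SAMPLES


The frequency distribution of samples is usually less scattered than individual
observations. In order to ascertain the manner of the distribution, we have
access to the well-known Type A Curve of Charlier.⁴

$$(3)\qquad F(t) = \phi(t) - \frac{c_3}{3!}\phi^{(3)}(t) + \frac{c_4}{4!}\phi^{(4)}(t) - \frac{c_5}{5!}\phi^{(5)}(t) + \cdots$$

where $\phi(t) = \dfrac{1}{\sqrt{2\pi}}\,e^{-t^2/2}$ and

$$\begin{aligned}
c_3 &= \alpha_3\\
c_4 &= \alpha_4 - 3\\
c_5 &= \alpha_5 - 10\alpha_3\\
c_6 &= \alpha_6 - 15\alpha_4 + 30\\
c_7 &= \alpha_7 - 21\alpha_5 + 105\alpha_3\\
c_8 &= \alpha_8 - 28\alpha_6 + 210\alpha_4 - 315, \text{ etc.}
\end{aligned}$$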


This formula is a powerful tool for representing any frequency; but it is 
emphasized by more than one author⁵ that the usefulness of such a series representation
of a frequency distribution depends upon the rapidity of convergence,
and the rapidity of convergence in turn depends upon the extent to which the
function $\phi(t)$ is a fair approximation for $F(t)$. We shall not, however, discuss
here the question of convergence. What we are interested in is to apply this 
series representation to the distribution of samples and see whether our numerical 
experimentation justifies the use of it. 


TABLE I 
Heights of 1000 Freshman Students 
(Original Measurements Made to Nearest 0.1 in.) 


Class 


58. 5-60. 
60. 5-62. 
62.5-64. 
64. 5-66. 
66. 5-68. 
68.5-70. 
70.5-72. 
72.5-74. 
74.5-76. 
76.5-78. 


Frequency 


2 
15 
76 
167 
339 
264 
106 

29 


Pe Pee ee ee 


⁴ Camp, B. H., The Mathematical Part of Elementary Statistics, p. 226.
⁵ Rietz, H. L., Mathematical Statistics, p. 62.
Carver, H. C., Frequency Curves, Handbook of Mathematical Statistics, p. 115.
First of all, therefore, we take for our numerical example the heights of 1000
freshman students in the University of Michigan, as recorded in Table I, which
are assumed to constitute our parent population.

From the above data we compute the first 6 moments as follows:

$M_x$ = 67.91
$\bar\mu_{2:x}$ = 6.279,068         $\sigma_x$ = 2.505,81
$\bar\mu_{3:x}$ = 0.489,552         $\alpha_{3:x}$ = 0.031,11
$\bar\mu_{4:x}$ = 132.685,214       $\alpha_{4:x}$ = 3.365,36
$\bar\mu_{5:x}$ = 78.435,794        $\alpha_{5:x}$ = 0.793,92
$\bar\mu_{6:x}$ = 4574.080,554      $\alpha_{6:x}$ = 18.476,43

Now suppose from this parent population in which $s = 1000$, we wish to
choose ${}_{1000}C_{100}$ samples, each consisting of 100 individuals. To characterize
the distribution of these samples, we first make the following table:


TABLE II
Values of $p_i$ and $P_i$ for s = 1000, r = 100

$p_1$ = .1
$p_2$ = .009,909,909,91
$p_3$ = .000,973,117,406
$p_4$ = .000,094,676,417,6
$p_5$ = .000,009,125,437,84
$p_6$ = .000,000,871,272,959,5

$P_1$ = .1
$P_2$ = .090,090,090,09
$P_3$ = .072,216,505,082
$P_4$ = .041,739,980,994
$P_5$ = −.005,454,352,918
$P_6$ = −.065,789,272,230
$P_2^2$ = .008,058,351,516
$P_2P_3$ = .006,472,571,500
$P_2P_4$ = .003,764,792,358
$P_2^3$ = .000,715,593,194
$P_3^2$ = .005,195,978,741


Substituting into formulae (1), we obtain the first six moments of the distribution
of samples:

$M_z$ = 6791
$\bar\mu_{2:z}$ = 565.621,622                  $\sigma_z$ = 23.782,8
$\bar\mu_{3:z}$ = 35.353,734                   $\alpha_{3:z}$ = .002,628
$\bar\mu_{4:z}$ = 958,720.852,854              $\alpha_{4:z}$ = 2.996,679
$\bar\mu_{5:z}$ = 198,538.702,142              $\alpha_{5:z}$ = .026,093
$\bar\mu_{6:z}$ = 2,704,514,780.791,465        $\alpha_{6:z}$ = 14.945,539
The coefficients of Charlier’s Type A Curve turn out to be very small and
rapidly decreasing; for the terms in $\phi^{(3)}$, $\phi^{(4)}$, $\phi^{(5)}$ and $\phi^{(6)}$ they are respectively

.000,438
−.000,138
−.000,016
−.000,006





We therefore may be justified in considering this series representation of the 
sample distribution as converging rapidly to the normal curve. It may be 
interesting to note that even from a parent population which is very skew, the 
distribution of samples is nearly normal—as the following example will show: 





TABLE III
Weights of 1000 Freshman Students
(Original Measurements Made to Nearest Pound)

Class     Frequency
 85-          1
 95-          8
105-         45
115-        132
125-        232
135-        244
145-        161
155-         97
165-         50
175-         16
185-          7
195-          3
205-          4

$M_x$ = 139.32
$\bar\mu_{2:x}$ = 296.8343          $\sigma_x$ = 17.228,87
$\bar\mu_{3:x}$ = 3,230.802         $\alpha_{3:x}$ = 0.631,74
$\bar\mu_{4:x}$ = 351,180.14        $\alpha_{4:x}$ = 3.985,67
$\bar\mu_{5:x}$ = 11,811,480.5      $\alpha_{5:x}$ = 7.780,71
$\bar\mu_{6:x}$ = 886,585,271

$M_z$ = 13,932
$\bar\mu_{2:z}$ = 26,741.828,829                  $\sigma_z$ = 163.529
$\bar\mu_{3:z}$ = 233,317.229,045                 $\alpha_{3:z}$ = .05334
$\bar\mu_{4:z}$ = 2,144,736,851.477,805           $\alpha_{4:z}$ = 2.9991
$\bar\mu_{5:z}$ = 62,008,368,279.121,883          $\alpha_{5:z}$ = .53024
$\bar\mu_{6:z}$ = 287,107,828,746,809.017         $\alpha_{6:z}$ = 15.00633

Coefficients of the Type A series for the terms in $\phi^{(3)}$, $\phi^{(4)}$, $\phi^{(5)}$ and $\phi^{(6)}$ respectively:

.008,89
−.000,04
−.000,03
.000,03


Indeed the distribution of samples, in general, is very nearly normal irrespective
of the law of distribution of the parent population. From the practical
point of view, as Professor H. C. Carver has remarked, the parent population has
little control over the shape of the distribution of the samples if $r$ is fifty or
greater and if $s$ is at least ten times as large as $r$.⁶

Now as a numerical illustration of the theory of sampling I may, for example, 
choose at random 100 weights from the parent population of 1000 weights of 
freshman students, as recorded in Table III, with the aim of ascertaining the 
probability that the mean of this sample exceeds 142 pounds. 

Since we define the mean of a sample simply as the average measurement of 
the r individuals in the sample, which in this case is 100, it therefore follows that 
the ordinary moments of the distribution of sample means differ from those of 
the distribution of samples in (1) only by a constant multiple of $1/r^k$ where $k$ is
the order of the moments concerned, while the standardized moments remain
unchanged. Therefore in this problem, we have the mean of the sample means
equal to 139.32 and the standard deviation equal to 1.63529. The average
weight, 142 pounds, may be expressed in standard units as

$$t = \frac{\bar z - M_{\bar z}}{\sigma_{\bar z}} = \frac{142 - 139.32}{1.63529} = 1.63885.$$





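In accordance with (3), the probability that the mean of the sample exceeds
142 pounds is therefore equal to

$$P = \int_{1.63885}^{\infty}\left[\phi(t) - \frac{c_3}{3!}\phi^{(3)}(t) + \frac{c_4}{4!}\phi^{(4)}(t) - \frac{c_5}{5!}\phi^{(5)}(t) + \cdots\right]dt.$$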


⁶ Carver, H. C., Annals of Mathematical Statistics, Vol. I, No. I, p. 112.
If we take the first term only, $P = \int_{1.63885}^{\infty}\phi(t)\,dt = .05062$.

If we take the first two terms, $P = \int_{1.63885}^{\infty}\phi(t)\,dt - .008896\,\Big[\phi^{(2)}(t)\Big]_{1.63885}^{\infty} = .05218$.

If we take the first three terms,

$$P = \int_{1.63885}^{\infty}\phi(t)\,dt - .008896\,\Big[\phi^{(2)}(t)\Big]_{1.63885}^{\infty} + (-.000{,}04)\,\Big[\phi^{(3)}(t)\Big]_{1.63885}^{\infty} = .052182.$$
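The three successive approximations can be reproduced with a few lines of arithmetic. This is a minimal sketch rather than the author's computing procedure; the function names are illustrative, and it relies on the derivatives of the normal density vanishing at infinity, so that each integrated term reduces to a derivative evaluated at the lower limit.

```python
import math

def phi(t):
    """Standard normal density."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

def upper_tail_normal(t):
    """Integral of phi from t to infinity."""
    return 0.5 * math.erfc(t / math.sqrt(2))

def type_a_upper_tail(t0, coef3, coef4=0.0):
    """Integral from t0 to infinity of phi - coef3 * phi''' + coef4 * phi''''.

    coef3 and coef4 stand for the series coefficients of the third and fourth
    derivative terms. The integral of phi''' over [t0, inf) is -phi''(t0), and
    the integral of phi'''' is -phi'''(t0).
    """
    d2 = (t0 ** 2 - 1) * phi(t0)       # phi''(t0)
    d3 = (3 * t0 - t0 ** 3) * phi(t0)  # phi'''(t0)
    return upper_tail_normal(t0) + coef3 * d2 - coef4 * d3

if __name__ == "__main__":
    t0 = 1.63885
    print(upper_tail_normal(t0))                      # first term only
    print(type_a_upper_tail(t0, 0.008896))            # first two terms
    print(type_a_upper_tail(t0, 0.008896, -0.00004))  # first three terms
```

The contribution of the third term is of the order of a few millionths, which is why the second and third approximations agree to four decimals.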


SECTION III. PEARSONIAN TYPES OF CURVES


Charlier’s Type A Series is, however, not the only known analytic representa- 
tion of a frequency distribution. There are Pearsonian Types of Curves, the 
characteristics of which I shall need to summarize briefly. These Pearsonian 
Types of Curves are essential to the later development of our theory. 

The curves, suggested by certain geometrical properties of unimodal frequency
distribution, are all obtained from the solution of the differential equation:

$$\frac{1}{y}\frac{dy}{dt} = \frac{a - t}{f(t)}$$

where $f(t)$ is assumed to be possibly expanded into a convergent power series,
that is, $f(t) = b_0 + b_1t + b_2t^2 + \cdots$. When the first three terms of the power
series are taken, the differential equation immediately takes the form of

$$\frac{1}{y}\frac{dy}{dt} = \frac{a - t}{b_0 + b_1t + b_2t^2}.$$

The parameters, $a$, $b_0$, $b_1$, $b_2$, may be expressed in terms of moments:⁷

$$a = -b_1 = -\frac{\alpha_3}{2(1 + 2\delta)}, \qquad b_0 = \frac{2 + \delta}{2(1 + 2\delta)}, \qquad b_2 = \frac{\delta}{2(1 + 2\delta)},$$

where

$$\delta = \frac{2\alpha_4 - 3\alpha_3^2 - 6}{\alpha_4 + 3}.$$

Based upon the difference in the nature of the roots of the equation
$b_0 + b_1t + b_2t^2 = 0$, there have been derived thirteen types of curves. Of the
particularly noteworthy ones, the normal curve and Type III may be mentioned.
The criterion for the normal curve is $\alpha_3 = \delta = 0$; that for Type III is


7 Carver, H. C., Frequency Curves, Handbook of Mathematical Statistics, p. 104. 
$\delta = 0$ and $\alpha_3 \neq 0$. In order to fix the form in a particular case, we may refer
to Pearson’s Chart of the $\beta_1$, $\beta_2$ Distribution⁸ where

$$\beta_1 = \alpha_3^2, \qquad \beta_2 = \alpha_4,$$

and

$$\kappa = \frac{b_1^2}{4b_0b_2} = \frac{\beta_1(\beta_2 + 3)^2}{4(4\beta_2 - 3\beta_1)(2\beta_2 - 3\beta_1 - 6)} = \frac{\alpha_3^2}{4\delta(2 + \delta)},$$

or to Elderton’s Frequency Curves and Correlation.⁹
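As a cross-check on the criterion, the two expressions for $\kappa$ can be compared numerically. The sketch below is illustrative: the function names are hypothetical, and the form taken for $\delta$, namely $(2\alpha_4 - 3\alpha_3^2 - 6)/(\alpha_4 + 3)$, is an assumption chosen because it makes the two expressions for $\kappa$ agree identically. Both functions divide by zero on the Type III line $2\beta_2 - 3\beta_1 - 6 = 0$, where $\kappa$ is infinite.

```python
def pearson_kappa(alpha3, alpha4):
    """kappa from beta1 = alpha3^2 and beta2 = alpha4."""
    beta1, beta2 = alpha3 ** 2, alpha4
    return beta1 * (beta2 + 3) ** 2 / (
        4 * (4 * beta2 - 3 * beta1) * (2 * beta2 - 3 * beta1 - 6)
    )

def delta(alpha3, alpha4):
    """Assumed form of delta: (2*alpha4 - 3*alpha3^2 - 6) / (alpha4 + 3)."""
    return (2 * alpha4 - 3 * alpha3 ** 2 - 6) / (alpha4 + 3)

def kappa_from_delta(alpha3, alpha4):
    """kappa = alpha3^2 / (4 * delta * (2 + delta))."""
    d = delta(alpha3, alpha4)
    return alpha3 ** 2 / (4 * d * (2 + d))

if __name__ == "__main__":
    # Hypothetical standardized moments of a moderately skew distribution.
    print(pearson_kappa(0.5, 4.0), kappa_from_delta(0.5, 4.0))
```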


SECTION IV. THE INVERSE SAMPLING, OUR PROBLEM


It is now our problem to study the theory of inverse sampling, by which we 
mean that given the characteristics of a single sample drawn at random from a 
parent population, we wish to ascertain the probability that the corresponding 
characteristics of that parent population do not differ from those observed in 
the sample by more than a specified amount. To illustrate, suppose we are 
interested in knowing the average height of 1000 freshman students to which 
reference has already been made. Due to the fact that it takes too much time 
or is otherwise impractical to measure all of them so as to obtain the true average, 
we select at random one hundred of them and measure the heights of these one 
hundred individuals. Suppose the mean, the standard deviation, and the 
skewness of this sample of one hundred are computed and they are as follows: 


$M$ = 67.99
$\sigma$ = 2.327
$\alpha_3$ = .12299


Now assuming that the true mean of the entire 1000 heights is unknown, let
us find the probability that the true mean of this parent population lies between
$M_x = a$ and $M_x = b$ by what we know of the characteristics of the observed
sample of one hundred as recorded above. It is clear that if we can obtain an
equation, $y = f(M_x)$, of the frequency curve associated with the distribution of
hypothetical means of this parent population, we shall be able to ascertain the
probability we desire by evaluating the following integral expression:

$$P = \frac{\int_a^b f(M_x)\,dM_x}{\int_{-\infty}^{\infty} f(M_x)\,dM_x}$$


⁸ Pearson, K., Tables for Statisticians and Biometricians, Vol. II, front page.
⁹ Elderton, W. P., Frequency Curves and Correlation, Table VI, opposite p. 46.
In the same way we can find the probability that the standard deviation of
the parent population lies between two definite limits or that the skewness of 
the parent population lies between two definite limits. 

Our procedure will therefore be as follows: First, assuming the a priori 
existence of a continuous sequence of hypothetical means of the parent popula- 
tion, we investigate the relation between the distribution of these hypothetical 
means of the parent population and the distribution of sample means. If such 
a relation exists, we shall be able to find an expression for the most probable 
value of the parent mean. Assuming the most probable value of the parent 
mean to be the true mean of the parent population, we shall obtain an expres- 
sion for the most probable value of the standard deviation of the parent popula- 
tion. Then it will be possible for us to express the frequency curve associated 
with the distribution of hypothetical means of the parent population in the form 
of f(M,). Similarly we may find the frequency functions associated with the 
standard deviation and skewness of the parent population. 

Before leaving this section, it is perhaps not out of place to say a word about 
the connection of this theory of inverse sampling with Bayes’s Theorem. The 
theory of inverse sampling (which deals essentially with the problem of judging 
the nature of a whole by observation of a part of it) belongs to the domain of 
inductive probability, or inverse probability, upon which Bayes’s Theorem was 
founded. In order to solve a problem of inductive probability, it is necessary 
to postulate the a priori existence of the causes from which an event takes place, 
which, in our case, is the hypothetical means of the parent population. 

This a priori hypothesis which gives rise to Bayes’s Theorem has been viewed 
with suspicion by a number of mathematical statisticians. For example, the 
theorem has been called into question by such mathematicians as Bing, Venn, 
Chrystal, and others, including several now living. But so far as the present 
writer is aware, no definite conclusion has been reached. It is true that on the 
one hand Bayes’s Theorem has not been rigidly demonstrated and proved by 
logic; but on the other hand the process of generalization from observational
data is justified within the limits of ordinary practical application. One who
holds Bayes’s Theorem strongly may even say that the a priori hypothesis is
absolutely necessary to scientific inferences. Concerning this controversy,
Pearson takes a liberal point of view: “I hold this theorem [Bayes’s Theorem]
not as rigidly demonstrated, but I think with Edgeworth that the hypothesis
of the equal distribution of ignorance is within the limits of practical life justified
by experience of statistical ratios, which a priori are unknown ... .”¹⁰ He has
further remarked that “the practical man ... will accept the results of inverse
probability of Bayes-Laplace brand till better are forthcoming.”¹¹ Using






¹⁰ Pearson, K., On the Influence of Past Experience on Future Expectation, Philosophical
Magazine, Vol. 13, Jan.-June, 1907, p. 366.

11 Pearson, K., The Fundamental Problem of Practical Statistics, Biometrika, Vol. 13, 
1920-21, p. 3. 
Pearson’s viewpoint, we shall proceed with our problem by postulating a priori
the existence of hypothetical means of the parent population from which our
sample is drawn.


Part II. Fundamental Relation between the Moments of the Distribution of 
Sampling Means and the Moments of the Distribution of the Hypothet- 
ical Means Associated with the Parent Population 


The characteristics of the distribution of sample means, as we have pointed out 
in Part I, Section II, differ from those of the sample distribution only by a 
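constant multiple of $(1/r)^k$ where $k$ is the order of the moments concerned. We
may write down the first six moments of the distribution of sample means:

$$(5)\qquad \begin{aligned}
M_{\bar z} &= M_x\\
\bar\mu_{2:\bar z} &= \frac{2!}{r^2}\left[P_2\,\frac{s\bar\mu_{2:x}}{2!}\right]\\
\bar\mu_{3:\bar z} &= \frac{3!}{r^3}\left[P_3\,\frac{s\bar\mu_{3:x}}{3!}\right]\\
\bar\mu_{4:\bar z} &= \frac{4!}{r^4}\left[P_4\,\frac{s\bar\mu_{4:x}}{4!} + \frac{P_2^2}{2!}\left(\frac{s\bar\mu_{2:x}}{2!}\right)^2\right]\\
\bar\mu_{5:\bar z} &= \frac{5!}{r^5}\left[P_5\,\frac{s\bar\mu_{5:x}}{5!} + P_3P_2\,\frac{s\bar\mu_{3:x}}{3!}\cdot\frac{s\bar\mu_{2:x}}{2!}\right]\\
\bar\mu_{6:\bar z} &= \frac{6!}{r^6}\left[P_6\,\frac{s\bar\mu_{6:x}}{6!} + P_4P_2\,\frac{s\bar\mu_{4:x}}{4!}\cdot\frac{s\bar\mu_{2:x}}{2!} + \frac{P_3^2}{2!}\left(\frac{s\bar\mu_{3:x}}{3!}\right)^2 + \frac{P_2^3}{3!}\left(\frac{s\bar\mu_{2:x}}{2!}\right)^3\right]
\end{aligned}$$

$$(6)\qquad \begin{aligned}
\alpha_{3:\bar z} &= \frac{s - 2r}{s - 2}\sqrt{\frac{s - 1}{r(s - r)}}\;\alpha_{3:x}\\
\alpha_{4:\bar z} - 3 &= \frac{(s - 1)(s^2 + s - 6rs + 6r^2)}{r(s - r)(s - 2)(s - 3)}\,(\alpha_{4:x} - 3) - \frac{6s(r - 1)(s - r - 1)}{r(s - r)(s - 2)(s - 3)}, \text{ etc.}
\end{aligned}$$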
If our parent population is infinite, which is a special case obtained by allowing $s \to \infty$,
then we have

$$\alpha_{3:\bar z} = \frac{1}{\sqrt r}\,\alpha_{3:x}, \qquad \alpha_{4:\bar z} - 3 = \frac{1}{r}\,(\alpha_{4:x} - 3), \text{ etc.}$$


Let us now define $f(t)$ as a frequency function of the distribution of sample
means $\bar z_j$ in standard units, i.e.,

$$(7)\qquad t = \frac{\bar z_j - M_{\bar z}}{\sigma_{\bar z}}.$$

Denoting the observed mean of a given sample by $m_1$ and making proper substitutions
of (5), we obtain

$$(8)\qquad t = \frac{m_1 - M_{\bar z}}{\sigma_{\bar z}} = \frac{m_1 - M_x}{\sigma_x\sqrt{\dfrac{s - r}{r(s - 1)}}}.$$


It is clear that if we hold $s$, $r$ and $\sigma_x$ constant and let $M_x$ vary, then $t$ is a
function of $M_x$ only and consequently $f(t)$ becomes a function of $M_x$.

Suppose now $M_x^{(1)}, M_x^{(2)}, M_x^{(3)}, \cdots$ be a continuous sequence of hypothetical
means, which $M_x$ has an equal chance to assume. These hypothetical means
will certainly lie in a linear interval between their natural limits. Then the
probability that $M_x$ lies between $M_x \pm \tfrac{1}{2}dM_x$ is $f(t)\,dM_x$. Therefore, to obtain
the probability that $M_x$ lies in the interval $M_x^{(i)} < M_x < M_x^{(i+1)}$, it is only
necessary to carry out the integration of this expression:

$$(9)\qquad \int_{M_x^{(i)}}^{M_x^{(i+1)}} f(t)\,dM_x$$





There is no question as regards the existence of this integral in case of an 
infinite parent population. As for a finite population, we may still use this 
continuous function as an interpolation function to the true discontinuous 
function. 

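Let us now define $P(t)$ as the probability function for which the hypothetical
mean of the parent population falls within certain specified limits. Considering
$\mu_{n:p}$ as the nth moment of this probability function about a fixed point, we will
have the following relation:

$$(10)\qquad \mu_{n:p} = \frac{\int_{-l}^{l} M_x^{\,n}\,f(t)\,dM_x}{\int_{-l}^{l} f(t)\,dM_x}$$

where $l$ and $-l$ are their natural limits.
Since from (8), $M_x = m_1 - \sigma_{\bar z}t$, then after substitution, we obtain

$$(11)\qquad \begin{aligned}
\mu_{n:p} &= \frac{\sigma_{\bar z}\int_{(m_1 - l)/\sigma_{\bar z}}^{(m_1 + l)/\sigma_{\bar z}} (m_1 - \sigma_{\bar z}t)^n f(t)\,dt}{\sigma_{\bar z}\int_{(m_1 - l)/\sigma_{\bar z}}^{(m_1 + l)/\sigma_{\bar z}} f(t)\,dt}\\
&= m_1^{\,n} - \binom{n}{1}m_1^{\,n-1}\bar\mu_{1:\bar z} + \binom{n}{2}m_1^{\,n-2}\bar\mu_{2:\bar z} - \binom{n}{3}m_1^{\,n-3}\bar\mu_{3:\bar z} + \cdots + (-1)^n\binom{n}{n}\bar\mu_{n:\bar z}
\end{aligned}$$

$$(12)\qquad \begin{aligned}
\mu_{1:p} &= M_p = m_1\\
\mu_{2:p} &= m_1^2 + \bar\mu_{2:\bar z}\\
\mu_{3:p} &= m_1^3 + 3m_1\bar\mu_{2:\bar z} - \bar\mu_{3:\bar z}\\
\mu_{4:p} &= m_1^4 + 6m_1^2\bar\mu_{2:\bar z} - 4m_1\bar\mu_{3:\bar z} + \bar\mu_{4:\bar z}, \text{ etc.}
\end{aligned}$$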


The first relation $M_p = m_1$ is important because it shows that the mean of
the hypothetical means of the parent population is equal to the mean of the
observed sample drawn from it. To state this in a theorem, we will have

Theorem I. The expected value of a parent mean is equal to the mean of an
observed sample chosen from the parent population.

We now wish to express the moments of the probability function about its
mean in terms of the moments of sample distribution. In general, the nth
moment of any frequency distribution about its mean, $\bar\mu_n$, can be expressed in
terms of its moments about a fixed point after the following fashion:

$$(13)\qquad \bar\mu_n = \mu_n - \binom{n}{1}M\mu_{n-1} + \binom{n}{2}M^2\mu_{n-2} - \cdots + (-1)^n\binom{n}{n}M^n.$$
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Therefore when we substitute (11) into (13) we obtain 


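$$\begin{aligned}
\bar\mu_{n:p} = {} & \Big[ m_1^n - \binom{n}{1} m_1^{n-1}\bar\mu_{1:z_x} + \binom{n}{2} m_1^{n-2}\bar\mu_{2:z_x} - \cdots + (-1)^{n-1}\binom{n}{n-1} m_1\bar\mu_{n-1:z_x} + (-1)^n\binom{n}{n}\bar\mu_{n:z_x} \Big] \\
& - m_1\binom{n}{1}\Big[ m_1^{n-1} - \binom{n-1}{1} m_1^{n-2}\bar\mu_{1:z_x} + \binom{n-1}{2} m_1^{n-3}\bar\mu_{2:z_x} - \cdots + (-1)^{n-1}\binom{n-1}{n-1}\bar\mu_{n-1:z_x} \Big] \\
& + m_1^2\binom{n}{2}\Big[ m_1^{n-2} - \binom{n-2}{1} m_1^{n-3}\bar\mu_{1:z_x} + \binom{n-2}{2} m_1^{n-4}\bar\mu_{2:z_x} - \cdots + (-1)^{n-2}\binom{n-2}{n-2}\bar\mu_{n-2:z_x} \Big] \\
& - m_1^3\binom{n}{3}\Big[ m_1^{n-3} - \binom{n-3}{1} m_1^{n-4}\bar\mu_{1:z_x} + \cdots + (-1)^{n-3}\binom{n-3}{n-3}\bar\mu_{n-3:z_x} \Big] \\
& + \cdots \\
& + (-1)^{n-1} m_1^{n-1}\binom{n}{n-1}\Big[ m_1 - \binom{1}{1}\bar\mu_{1:z_x} \Big] \\
& + (-1)^n m_1^n\binom{n}{n}.
\end{aligned}$$

Adding vertically each column, we obtain

$$\begin{aligned}
\bar\mu_{n:p} = {} & m_1^n\Big[\binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \binom{n}{3} + \cdots + (-1)^n\binom{n}{n}\Big] \\
& - m_1^{n-1}\bar\mu_{1:z_x}\Big[\binom{n}{0}\binom{n}{1} - \binom{n}{1}\binom{n-1}{1} + \binom{n}{2}\binom{n-2}{1} - \cdots + (-1)^{n-1}\binom{n}{n-1}\binom{1}{1}\Big] \\
& + m_1^{n-2}\bar\mu_{2:z_x}\Big[\binom{n}{0}\binom{n}{2} - \binom{n}{1}\binom{n-1}{2} + \binom{n}{2}\binom{n-2}{2} - \cdots + (-1)^{n-2}\binom{n}{n-2}\binom{2}{2}\Big] \\
& + \cdots \\
& + (-1)^{n-1} m_1\bar\mu_{n-1:z_x}\Big[\binom{n}{0}\binom{n}{n-1} - \binom{n}{1}\binom{n-1}{n-1}\Big] \\
& + (-1)^n\bar\mu_{n:z_x}.
\end{aligned}$$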


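The first row of the above expression is equal to $m_1^n(1 - 1)^n = 0$; the second
row is equal to

$$\begin{aligned}
& -m_1^{n-1}\bar\mu_{1:z_x}\Big[\frac{n!}{0!\,n!}\cdot\frac{n!}{1!\,(n-1)!} - \frac{n!}{1!\,(n-1)!}\cdot\frac{(n-1)!}{1!\,(n-2)!} + \frac{n!}{2!\,(n-2)!}\cdot\frac{(n-2)!}{1!\,(n-3)!} - \cdots + (-1)^{n-1}\frac{n!}{(n-1)!\,1!}\Big] \\
&= -m_1^{n-1}\bar\mu_{1:z_x}\,\frac{n!}{1!}\Big[\frac{1}{0!\,(n-1)!} - \frac{1}{1!\,(n-2)!} + \frac{1}{2!\,(n-3)!} - \cdots + (-1)^{n-1}\frac{1}{(n-1)!\,0!}\Big] \\
&= -m_1^{n-1}\bar\mu_{1:z_x}\,\frac{n}{1!}\Big[1 - {}_{n-1}C_1 + {}_{n-1}C_2 - \cdots + (-1)^{n-1}\,{}_{n-1}C_{n-1}\Big] \\
&= -m_1^{n-1}\bar\mu_{1:z_x}\,\frac{n}{1!}\,(1 - 1)^{n-1} = 0;
\end{aligned}$$

the third row is equal to

$$\begin{aligned}
& m_1^{n-2}\bar\mu_{2:z_x}\Big[\frac{n!}{0!\,n!}\cdot\frac{n!}{2!\,(n-2)!} - \frac{n!}{1!\,(n-1)!}\cdot\frac{(n-1)!}{2!\,(n-3)!} + \frac{n!}{2!\,(n-2)!}\cdot\frac{(n-2)!}{2!\,(n-4)!} - \cdots + (-1)^{n-2}\frac{n!}{(n-2)!\,2!}\Big] \\
&= m_1^{n-2}\bar\mu_{2:z_x}\,\frac{n(n-1)}{2!}\Big[1 - {}_{n-2}C_1 + {}_{n-2}C_2 - \cdots + (-1)^{n-2}\,{}_{n-2}C_{n-2}\Big] \\
&= m_1^{n-2}\bar\mu_{2:z_x}\,\frac{n(n-1)}{2!}\,(1 - 1)^{n-2} = 0;
\end{aligned}$$

and similarly all the other rows turn out to be zero except the last one which is
equal to $(-1)^n\bar\mu_{n:z_x}$. Hence

$$\bar\mu_{n:p} = (-1)^n\bar\mu_{n:z_x}. \tag{14}$$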
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This may be rewritten as 


$$\bar\mu_{2n:p} = \bar\mu_{2n:z_x}, \qquad \bar\mu_{2n+1:p} = -\bar\mu_{2n+1:z_x}, \tag{15}$$

or in standard units

$$\alpha_{2n:p} = \alpha_{2n:z_x}, \qquad \alpha_{2n+1:p} = -\alpha_{2n+1:z_x}.$$

The results¹² of (15) are important and fundamental because they establish
the relation between the Theory of Inverse Sampling and the Theory of Sam- 
pling. Therefore we may formulate the following theorems: 

Theorem II. The even moments of the distribution of the hypothetical means 
of a parent population about its mean are equal to the corresponding even mo- 
ments of the distribution of the sample means about the mean. 

Theorem III. The odd moments of the distribution of the hypothetical 
means of a parent population about its mean are equal to the negative of the 
corresponding odd moments of the distribution of the sample means about the 
mean. 
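The cancellation behind (14) can be checked mechanically. The sketch below is only an illustration, not part of the original derivation: it takes an arbitrary made-up set of central moments for the sample means (the numbers in `mu_bar` and the value of `m1` are assumptions chosen for the test), builds the moments of P(t) about a fixed point by (11), converts them back to moments about the mean by (13), and compares the result with $(-1)^n\bar\mu_{n:z_x}$. Exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction
from math import comb


def raw_moments_of_p(m1, mu_bar):
    """Formula (11): moments of P about a fixed point, built from the
    central moments mu_bar[i] of the distribution of sample means."""
    return [
        sum((-1) ** i * comb(n, i) * m1 ** (n - i) * mu_bar[i] for i in range(n + 1))
        for n in range(len(mu_bar))
    ]


def central_moments(raw, mean):
    """Formula (13): moments about the mean from moments about a fixed point."""
    return [
        sum((-1) ** j * comb(n, j) * mean ** j * raw[n - j] for j in range(n + 1))
        for n in range(len(raw))
    ]


# Hypothetical inputs: m1 is the observed sample mean, mu_bar[n] the n-th
# central moment of the sample means (mu_bar[0] = 1, mu_bar[1] = 0).
m1 = Fraction(7, 3)
mu_bar = [Fraction(1), Fraction(0), Fraction(2), Fraction(-3), Fraction(11), Fraction(5)]

raw = raw_moments_of_p(m1, mu_bar)
central = central_moments(raw, raw[1])

for n, value in enumerate(central):
    print(n, value, (-1) ** n * mu_bar[n])
```

The even-order entries agree with the inputs and the odd-order entries change sign, which is the content of Theorems II and III.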

Since the even moments of the two distributions are the same, while the odd 
moments differ only in sign, it is evident that for symmetrical distributions, the 
two curves f(t) and P(t) are exactly identical, because in a symmetrical distribu- 
tion all the odd moments about the mean are bound to vanish. In case of 
nonsymmetrical distributions, the curve P(t) is nothing but a vertical reflection 
of the curve f(t) as shown in the figure:


In other words, if f(t), for instance, assumes Pearson’s Type III Function, then 
P(t) also assumes Pearson’s Type III Function except that their skewness is 
different in sign though equal numerically. We therefore state our theorem as 
follows: 


12 So far as the writer is aware, these theorems were first developed by Professor H. C. 
Carver. 
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TheoremIV. ‘The curves for the distribution of the hypothetical means of the 
parent population and the curve for the distribution of the means of the sample
obtained from the parent population are symmetrically situated and one is a
vertical reflection of the other.


Part III. Inverse Sampling Associated with a Normal Parent Population 


We shall be concerned in this part of our discussion with a normal parent 
population. In accordance with the characteristics of a normal parent popula- 
tion we wish to investigate the most probable values of its mean and variance, 
thereby obtaining the distributions of the hypothetical means and variances of 
the parent population. 


Section I. MOST PROBABLE VALUE OF THE MEAN OF THE PARENT POPULATION

In Part I, Section III, we have mentioned Pearsonian Types of Frequency
Curves whose differential equation is

$$\frac{1}{y}\frac{dy}{dt} = \frac{a - t}{b_0 + b_1 t + b_2 t^2}.$$

It is clear that the mode of these curves is at $t = a$, provided the mode exists.
But to recapitulate:

$$a = -\frac{\alpha_3}{2(1 + 2\delta)},$$

where

$$\delta = \frac{2\alpha_4 - 3\alpha_3^2 - 6}{\alpha_4 + 3};$$

consequently for the mode of the distribution of sample means, we have

$$t = -\frac{\alpha_{3:z_x}}{2(1 + 2\delta_{z_x})}, \tag{16}$$

where

$$\delta_{z_x} = \frac{2\alpha_{4:z_x} - 3\alpha_{3:z_x}^2 - 6}{\alpha_{4:z_x} + 3}
= \frac{2(s-1)(s-2)(s^2 + s - 6rs + 6r^2)(\alpha_{4:x} - 3) - 12s(r-1)(s-2)(s-r-1) - 3(s-1)(s-3)(s-2r)^2\alpha_{3:x}^2}{(s-2)\{(s-1)(s^2 + s - 6rs + 6r^2)(\alpha_{4:x} - 3) - 6s(r-1)(s-r-1) + 6r(s-r)(s-2)(s-3)\}}; \tag{17}$$

that is,

$$\frac{z_x - M_x}{\sigma_x\sqrt{\dfrac{s-r}{r(s-1)}}} = -\frac{s-2r}{s-2}\sqrt{\frac{s-1}{r(s-r)}}\;\frac{\alpha_{3:x}}{2(1 + 2\delta_{z_x})}. \tag{18}$$
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Now according to Theorem IV, the mode of the probability function P(t) is 
situated symmetrically with respect to the mode of the frequency function f(t) 
of the distribution of sample means; hence, for the mode of the probability func- 
tion of hypothetical means of the parent population, we have 


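$$t = \frac{m_1 - M_x}{\sigma_x\sqrt{\dfrac{s-r}{r(s-1)}}} = \frac{s-2r}{s-2}\sqrt{\frac{s-1}{r(s-r)}}\;\frac{\alpha_{3:x}}{2(1 + 2\delta_{z_x})}, \tag{19}$$

where $\delta_{z_x}$ remains unchanged because it is a function of $\alpha_{3:z_x}^2$ and $\alpha_{4:z_x}$, each being
always positive.

Solving for $M_x$, which will now be the most probable value of the mean of the
parent population and hence denoted by $\hat M_x$, we have

$$\hat M_x = m_1 - \frac{s-2r}{r(s-2)}\cdot\frac{\sigma_x\,\alpha_{3:x}}{2(1 + 2\delta_{z_x})}. \tag{20}$$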

It is interesting to note that if $s = 2r$, this expression yields $\hat M_x = m_1$ irrespective
of the law of distribution of the parent population, provided only that $\delta_{z_x}$
is not exactly equal to $-\tfrac{1}{2}$. But since Pearson's function is used for graduation,
one should not fail to see that the mode so obtained gives only an approximation
to the true mode. Therefore we state a theorem as follows:

Theorem V. If a sample is composed of one-half of the variates of the parent
population from which the sample is chosen, then the best approximated ‘most
probable value’ of the mean of the parent population is equal to the mean of the
observed sample, provided only that $\delta_{z_x}$ is not exactly equal to $-\tfrac{1}{2}$.

It is further observed that if $\alpha_{3:x} = 0$ but $\delta_{z_x} \neq -\tfrac{1}{2}$, then the expression (20)
will likewise yield $\hat M_x = m_1$. But $\alpha_{3:x} = 0$ implies that the frequency curve of
the parent population is symmetrical. Hence

Theorem VI. For any symmetrical curves associated with the distribution of
the parent population, the best approximated ‘most probable value’ of the mean
of the parent population is equal to the mean of the observed sample, provided
$\delta_{z_x}$ is not exactly equal to $-\tfrac{1}{2}$.

But we will investigate further the most probable value of the mean of a
normal parent population, and we know that in a normal distribution the
moments bear the following relation:¹³

$$\alpha_{2n} = \frac{(2n)!}{2^n\, n!}, \qquad \alpha_{2n+1} = 0. \tag{21}$$


13 Carver, H. C., Frequency Curves, Handbook of Mathematical Statistics, p. 97. 
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ete. 
Consequently for a normal parent population the $\delta_{z_x}$ function in (17) is
immediately reduced to

$$\delta_{z_x} = \frac{2s(r-1)(s-r-1)}{s(r-1)(s-r-1) - r(s-r)(s-2)(s-3)}. \tag{22}$$

Let us, first of all, investigate the possibility that this expression will be
exactly equal to $-\tfrac{1}{2}$ for positive integral values of r and s.
Suppose we set

$$\frac{2s(r-1)(s-r-1)}{s(r-1)(s-r-1) - r(s-r)(s-2)(s-3)} = -\frac{1}{2}$$

and solve r in terms of s. Thus we obtain

$$r = \frac{s}{2} \pm \frac{\sqrt{s^2(s^2 - 10s + 6)^2 + 20s(s-1)(s^2 - 10s + 6)}}{2(s^2 - 10s + 6)}. \tag{23}$$

If $s \geq 10$, then the second term on the right side is positive. As it is absurd
that r should be greater than s, therefore the positive sign of the double sign
should not be taken. Then, as the second term is obviously greater than $\tfrac{s}{2}$, the
right member will be negative. Since r cannot be negative, no positive integral
values of r and s, for which $s \geq r$, can satisfy (23). For $s < 10$, there are only
nine positive integers; and direct substitution of each will tell us that only when
s = 1, 2, or 3, r is a positive integer which is either 1 or 2. As these are trivial
cases because a parent population can never be so small, we may safely say that
for a normal parent population

$$\hat M_x = m_1. \tag{24}$$


Theorem VII. For anormal parent population, the best approximated ‘most 
probable value’ of the mean of the parent population is equal to the mean of the 
observed sample from it. 
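The integer argument leading to (24) can be confirmed by direct search. Setting (22) equal to $-\tfrac12$ and clearing fractions gives $5s(r-1)(s-r-1) = r(s-r)(s-2)(s-3)$; the sketch below simply tests this identity for every pair $1 \le r \le s$ up to a bound. The bound of 300 and the function names are assumptions made for the illustration.

```python
def delta_is_minus_half(r, s):
    """True when 5 s (r-1)(s-r-1) == r (s-r)(s-2)(s-3), i.e. when (22),
    cleared of fractions, would equal -1/2."""
    return 5 * s * (r - 1) * (s - r - 1) == r * (s - r) * (s - 2) * (s - 3)


def solutions(max_s):
    """All pairs (s, r) with 1 <= r <= s <= max_s satisfying the cleared equation."""
    return [
        (s, r)
        for s in range(1, max_s + 1)
        for r in range(1, s + 1)
        if delta_is_minus_half(r, s)
    ]


found = solutions(300)
print(found)
```

Only pairs with s equal to 1, 2, or 3 survive, and in those cases both sides of the cleared equation are zero, so (22) itself is of the indeterminate form 0/0 rather than a genuine value of $-\tfrac12$.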

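For an infinite parent population, i.e., $s \to \infty$, (20) yields on reduction

$$\hat M_x = m_1 - \frac{1}{r}\cdot\frac{\sigma_x\,\alpha_{3:x}}{2(1 + 2\delta_{z_x})}, \tag{25}$$

where

$$\delta_{z_x} = \frac{2(\alpha_{4:x} - 3) - 3\alpha_{3:x}^2}{(\alpha_{4:x} - 3) + 6r} \quad [\text{from (17)}].$$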







80 CHING-LAI SHEN 


Formula (25) yields immediately $\hat M_x = m_1$ if $\alpha_{3:x} = 0$ and $\delta_{z_x} \neq -\tfrac{1}{2}$. For a
normal parent population $\delta_{z_x} = 0$. Hence Theorem VI and Theorem VII both
hold for the infinite case.


Section II. MOST PROBABLE VALUE OF THE STANDARD DEVIATION OF THE
PARENT POPULATION

To find the most probable value of the standard deviation of the parent
population, we shall assume the mean of the parent population to be the best
approximated ‘most probable value’ of the mean, which we have obtained in the
preceding section. This assumption is necessary since we do not know the true
mean of the parent population.

Now, to start with, we shall consider ${}_sC_r$ possible samples, each consisting of
r variables. The second moment of each sample computed about the best
approximated ‘most probable value’ of the mean of the parent population may
be written as

$$\begin{aligned}
z_1 &= \frac{1}{r}\{(x_1 - m_1)^2 + (x_2 - m_1)^2 + (x_3 - m_1)^2 + \cdots + (x_r - m_1)^2\}, \\
z_2 &= \frac{1}{r}\{(x_2 - m_1)^2 + (x_3 - m_1)^2 + (x_4 - m_1)^2 + \cdots + (x_{r+1} - m_1)^2\}, \\
&\;\;\vdots \\
z_{{}_sC_r} &= \frac{1}{r}\{(x_{s-r+1} - m_1)^2 + (x_{s-r+2} - m_1)^2 + (x_{s-r+3} - m_1)^2 + \cdots + (x_s - m_1)^2\}.
\end{aligned}$$

If we write $(x_i - m_1)^2 = y_i$, it is clear that the above may be considered as a
distribution of sample means drawn from a parent population $y_1, y_2, y_3, \cdots, y_s$;
and consequently

$$\begin{aligned}
M_{z_y} &= M_y, \\
\sigma_{z_y} &= \sigma_y\sqrt{\frac{s-r}{r(s-1)}}, \\
\alpha_{3:z_y} &= \frac{s-2r}{s-2}\sqrt{\frac{s-1}{r(s-r)}}\;\alpha_{3:y}, \\
\alpha_{4:z_y} - 3 &= \frac{(s-1)(s^2 + s - 6rs + 6r^2)}{r(s-r)(s-2)(s-3)}\,(\alpha_{4:y} - 3) - \frac{6s(r-1)(s-r-1)}{r(s-r)(s-2)(s-3)}.
\end{aligned} \tag{26}$$
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Now the nth moment of y about a fixed point may be written as 
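$$\begin{aligned}
\mu'_{n:y} &= \frac{1}{N}\sum y^n = \frac{1}{N}\sum (x - m_1)^{2n} = \frac{1}{N}\sum\{(x - M_x) + (M_x - m_1)\}^{2n} \\
&= \bar\mu_{2n:x} + \binom{2n}{1}\bar\mu_{2n-1:x}(M_x - m_1) + \binom{2n}{2}\bar\mu_{2n-2:x}(M_x - m_1)^2 + \binom{2n}{3}\bar\mu_{2n-3:x}(M_x - m_1)^3 \\
&\quad + \binom{2n}{4}\bar\mu_{2n-4:x}(M_x - m_1)^4 + \cdots + (M_x - m_1)^{2n}.
\end{aligned} \tag{27}$$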


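On the assumption that our parent population is normally distributed and
due to the fact that in a normally distributed function

$$\alpha_{2n} = \frac{(2n)!}{2^n\, n!} \quad\text{and}\quad \alpha_{2n+1} = 0 \quad [\text{See (21)}],$$

the expression (27) immediately takes this form:

$$\mu'_{n:y} = \frac{(2n)!}{2^n\, n!}\,\sigma_x^{2n} + \binom{2n}{2}\frac{(2n-2)!}{2^{n-1}(n-1)!}\,(M_x - m_1)^2\,\sigma_x^{2n-2} + \binom{2n}{4}\frac{(2n-4)!}{2^{n-2}(n-2)!}\,(M_x - m_1)^4\,\sigma_x^{2n-4} + \cdots + (M_x - m_1)^{2n}. \tag{28}$$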


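Imposing the condition mentioned at the beginning of this section (i.e., $M_x$
assumes its best approximated ‘most probable value’ $m_1$), then all the terms drop
out except the first one. Hence, as a final form, we have

$$\mu'_{n:y} = \frac{(2n)!}{2^n\, n!}\,\sigma_x^{2n}; \tag{29}$$

$$\begin{aligned}
\mu'_{1:y} &= M_y = \sigma_x^2, \\
\mu'_{2:y} &= 3\sigma_x^4, \\
\mu'_{3:y} &= 15\sigma_x^6, \\
\mu'_{4:y} &= 105\sigma_x^8,
\end{aligned} \tag{30}$$

etc.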
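The numerical coefficients in (29) and (30), and the moments of y about its mean that follow from them through the general conversion (13), are easy to reproduce. The sketch below works in units of $\sigma_x^2 = 1$ (an assumption made only to keep the arithmetic in integers); the function names are hypothetical.

```python
from math import comb, factorial, sqrt


def raw_moment_y(n, sigma2=1):
    """Formula (29): n-th moment of y = (x - m1)^2 about zero,
    (2n)! / (2^n n!) * sigma_x^(2n), for a normal parent population."""
    return factorial(2 * n) // (2 ** n * factorial(n)) * sigma2 ** n


def central_moment_y(k, sigma2=1):
    """k-th moment of y about its mean, via the conversion (13)."""
    mean = raw_moment_y(1, sigma2)
    return sum(
        (-1) ** j * comb(k, j) * mean ** j * raw_moment_y(k - j, sigma2)
        for j in range(k + 1)
    )


raw = [raw_moment_y(n) for n in range(1, 5)]         # coefficients of sigma^2 ... sigma^8
central = [central_moment_y(k) for k in range(2, 5)]  # coefficients of sigma^4, sigma^6, sigma^8
alpha3_y = central[1] / central[0] ** 1.5
alpha4_y = central[2] / central[0] ** 2

print(raw, central, alpha3_y, alpha4_y)
```

With these coefficients the skewness of y is $2\sqrt{2}$ and its kurtosis is 15, so that $\alpha_{4:y} - 3 = 12$, which is the factor 12 appearing in the expression for $\alpha_{4:z_y} - 3$ later in this section.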
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It follows that the kth moment of y about its mean will be 


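$$\begin{aligned}
\bar\mu_{k:y} &= \frac{\sum (y - M_y)^k}{N} = \mu'_{k:y} - \binom{k}{1}\mu'_{k-1:y}M_y + \binom{k}{2}\mu'_{k-2:y}M_y^2 - \cdots + (-1)^k\binom{k}{k}M_y^k \\
&= \sigma_x^{2k}\Big[\frac{(2k)!}{2^k\,k!} - \binom{k}{1}\frac{(2k-2)!}{2^{k-1}(k-1)!} + \binom{k}{2}\frac{(2k-4)!}{2^{k-2}(k-2)!} - \cdots + (-1)^k\Big] \\
&= \frac{(2k)!}{2^k\,k!}\,\sigma_x^{2k}\Big[1 - \frac{k}{1!\,(2k-1)} + \frac{k(k-1)}{2!\,(2k-1)(2k-3)} - \frac{k(k-1)(k-2)}{3!\,(2k-1)(2k-3)(2k-5)} + \cdots\Big].
\end{aligned} \tag{31}$$

Thus

$$\bar\mu_{2:y} = 2\sigma_x^4, \qquad \bar\mu_{3:y} = 8\sigma_x^6, \qquad \bar\mu_{4:y} = 60\sigma_x^8, \tag{32}$$

etc.

And therefore we obtain

$$\alpha_{3:y} = 2\sqrt{2}, \qquad \alpha_{4:y} = 15. \tag{33}$$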
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Making proper substitution of (30), (32), (83) into (26), we obtain 


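$$\begin{aligned}
M_{z_y} &= \sigma_x^2, \\
\sigma_{z_y} &= \sigma_x^2\sqrt{\frac{2(s-r)}{r(s-1)}}, \\
\alpha_{3:z_y} &= \frac{2(s-2r)}{s-2}\sqrt{\frac{2(s-1)}{r(s-r)}}, \\
\alpha_{4:z_y} - 3 &= \frac{12(s-1)(s^2 + s - 6rs + 6r^2) - 6s(r-1)(s-r-1)}{r(s-r)(s-2)(s-3)}.
\end{aligned} \tag{34}$$

For an infinite parent population, i.e., $s \to \infty$, we have

$$M_{z_y} = \sigma_x^2, \qquad \sigma_{z_y} = \sigma_x^2\sqrt{\frac{2}{r}}, \qquad \alpha_{3:z_y} = 2\sqrt{\frac{2}{r}}, \qquad \alpha_{4:z_y} - 3 = \frac{12}{r}. \tag{35}$$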

Now again with reference to Pearsonian Types of Curves for which the mode
is at $t = a$, we have for the mode of the distribution of sample means $z_y$,

$$t = \frac{z_y - M_{z_y}}{\sigma_{z_y}} = -\frac{\alpha_{3:z_y}}{2(1 + 2\delta_{z_y})}, \tag{36}$$

where

$$\delta_{z_y} = \frac{2\alpha_{4:z_y} - 3\alpha_{3:z_y}^2 - 6}{\alpha_{4:z_y} + 3}
= 2 - \frac{(s-3)\,[4(s-2r)^2(s-1) + 2r(s-2)^2(s-r)]}{(s-2)\{2(s-1)(s^2 + s - 6rs + 6r^2) + r(s-r)(s-2)(s-3) - s(r-1)(s-r-1)\}}. \tag{37}$$

Substituting (34) into (36), we obtain

$$\frac{z_y - \sigma_x^2}{\sigma_x^2\sqrt{\dfrac{2(s-r)}{r(s-1)}}} = -\frac{s-2r}{s-2}\sqrt{\frac{2(s-1)}{r(s-r)}}\cdot\frac{1}{1 + 2\delta_{z_y}}. \tag{38}$$

By Theorem IV, the best approximated ‘most probable value’ of the standard
deviation of the parent population is obtained from (38) by changing the sign
of the right member and replacing $z_y$ by $m_2$. Thus we have

$$\frac{m_2 - \sigma_x^2}{\sigma_x^2\sqrt{\dfrac{2(s-r)}{r(s-1)}}} = \frac{s-2r}{s-2}\sqrt{\frac{2(s-1)}{r(s-r)}}\cdot\frac{1}{1 + 2\delta_{z_y}}.$$
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Solving for o,, which is now the best approximated ‘most probable value’ and 
should therefore be denoted by $\hat\sigma_x^2$, we then have

$$\hat\sigma_x^2 = \frac{m_2}{1 + \dfrac{2(s-2r)}{r(s-2)(1 + 2\delta_{z_y})}}. \tag{39}$$

The best approximated ‘most probable value’ of the standard deviation may
therefore be written down as

$$\hat\sigma_x = \frac{\sigma_x}{\sqrt{1 + \dfrac{2(s-2r)}{r(s-2)(1 + 2\delta_{z_y})}}}, \quad\text{where } \sigma_x = \sqrt{m_2}.$$

This formula is, of course, subject to a systematic error that arises from the fact
that we employ the square root of the best estimated ‘most probable value’ of
the variance. It may be shown, however, that when r is large, the error is small.¹⁴
Consequently, we have the following theorem:

Theorem VIII. For a normal parent population, the best approximated
‘most probable value’ of the standard deviation of the parent population is
equal to

$$\frac{\sigma_x}{\sqrt{1 + \dfrac{2(s-2r)}{r(s-2)(1 + 2\delta_{z_y})}}},$$

where $\sigma_x$ is the standard deviation of an observed sample from the parent
population and $\delta_{z_y}$ is a function of r and s as expressed in (37).

It is interesting to note from (39) that when $s = 2r$, $\hat\sigma_x = \sigma_x$, provided $\delta_{z_y} \neq -\tfrac{1}{2}$.
However, from (37), $\delta_{z_y}$ cannot be equal to $-\tfrac{1}{2}$ in the case of $s = 2r$, where s
and r are both positive integers. Consequently, we may state this fact in
another theorem:

Theorem IX. If a sample is composed of exactly half of the variates of a 
normal parent population, then the best approximated ‘most probable value’ 
of the standard deviation of that parent population is equal to the standard 
deviation of an observed sample from it. 

For an infinite parent population, (39) yields on reduction

$$\hat\sigma_x = \sigma_x\sqrt{\frac{r}{r + 2}}, \quad\text{for } \delta_{z_y} = 0 \text{ when } s \to \infty. \tag{40}$$


14 Professor H. C. Carver has worked out a relation between the most probable value of 
x? and that of z by assuming that the latter is distributed according to a Type III distribu- 


tion. With his permission, I state the result as follows: 





ae =ay 


M,. ' ca 
where \ = and M, = the distance of the mean from the origin. 
Oz 


M.P.V.2 = (M. P. V. 2) ( ~ 
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Theorem X. For an infinite normal parent population, the best approximated 
‘most probable value’ of the standard deviation of the parent population is
equal to the standard deviation of an observed sample multiplied by $\sqrt{\dfrac{r}{r+2}}$.


Section III. DISTRIBUTION OF THE HYPOTHETICAL MEANS OF THE PARENT
POPULATION

In the preceding two sections, we have obtained the best approximated ‘most 
probable value’ of the mean and the best approximated ‘most probable value’ 
of the standard deviation of a parent population assumed to be normal. We 
are now in the position to characterize the distribution of these hypothetical 
means by assuming that the best approximated ‘most probable value’ of the 
mean of the parent population be its mean and the best approximated ‘most 
probable value’ of the standard deviation of the parent population be its standard
deviation. Such a characterization is subject to its own probable error.

Due to the fact that our parent population is normal by assumption, formulae 
(4), which we are to use this time, have to be modified by the proper substitution 
of the recursion relation of the moments of a normal distribution [See (21)]. 
After such modifications, they assume the following forms: 


(M., = M; 
| 

a <, Pails 
| 0 

38 vor 
— (P, + P38)ji2:2 


0 


Facey = (Ps + 3PiPes + P3s°)p3 


In accordance with Theorems II and III, we therefore have for the distribution 
of the means of the parent population the following: 


(Mu, =m 


Me: M, 
:Mz 


| 
* 
) ita 


15 
“= fie, ™ > (Py + 3P,P28 + P38")h2.2 


15s 


** (Py + 3PyPos + P3s*)ub., 
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Consequently 


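$$\begin{aligned}
M_{M_x} &= m_1, \\
\sigma_{M_x} &= \hat\sigma_x\sqrt{\frac{s-r}{r(s-1)}}, \\
\alpha_{3:M_x} &= 0, \\
\alpha_{4:M_x} - 3 &= -\frac{6s(r-1)(s-r-1)}{r(s-r)(s-2)(s-3)}.
\end{aligned} \tag{43}$$

For an infinite parent population, i.e., $s \to \infty$, we have

$$\begin{aligned}
M_{M_x} &= m_1, \\
\sigma_{M_x} &= \frac{\hat\sigma_x}{\sqrt{r}} = \frac{\sigma_x}{\sqrt{r + 2}} \quad [\text{from (40)}], \\
\alpha_{3:M_x} &= 0, \\
\alpha_{4:M_x} - 3 &= 0.
\end{aligned} \tag{44}$$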





Now if we can find the equation of the curve associated with the distribution 
of the means of the parent population, we shall be able to ascertain the prob- 
ability that a mean lies within certain limits after a sample from the parent 
population has once been observed. 

Let us illustrate this by again referring to the same problem of the heights of
1000 freshman students as recorded in Table I. Considering this as our parent
population, which is almost normal with s = 1000, we take every tenth individual
height from the original list in which the 1000 heights are tabulated.
Thus we obtain a sample with r = 100. The frequency distribution of these 100
individual heights is shown in Table IV.










TABLE IV
Sample of 100 Heights Selected from the Parent Population of 1000 from Table I

    Class          Frequency
    62.5-64.4          9
    64.5-66.4         16
    66.5-68.4         31
    68.5-70.4         29
    70.5-72.4         13
    72.5-74.4          2
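The sample constants quoted for this table can be recomputed from the class frequencies. The text does not say how the grouped moments were adjusted; the sketch below assumes class midpoints, a class width of 2 inches, and Sheppard's corrections for the second and fourth moments, since under those assumptions the printed third and fourth moments are reproduced closely and the second moment to about three decimals. All variable names are hypothetical.

```python
# (lower limit, upper limit, frequency) for each class of Table IV
classes = [
    (62.5, 64.4, 9), (64.5, 66.4, 16), (66.5, 68.4, 31),
    (68.5, 70.4, 29), (70.5, 72.4, 13), (72.5, 74.4, 2),
]
h = 2.0  # assumed class width in inches

mids = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [f for _, _, f in classes]
n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, mids)) / n


def grouped_central(k):
    """k-th moment of the grouped data about its mean, before any correction."""
    return sum(f * (x - mean) ** k for f, x in zip(freqs, mids)) / n


nu2, nu3, nu4 = grouped_central(2), grouped_central(3), grouped_central(4)

# Sheppard's corrections (assumed to have been applied in the text)
m2 = nu2 - h ** 2 / 12
m3 = nu3
m4 = nu4 - h ** 2 / 2 * nu2 + 7 * h ** 4 / 240

print(n, mean, m2, m3, m4)
```

If the corrections are dropped, the second moment comes out near 5.75 instead of 5.415, which is why the correction is assumed here.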
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We compute the mean, the standard deviation, the skewness, and the fourth 
moment about the mean of this sample:

    m_1 = 67.99
    m_2 = 5.415,2            σ_x = 2.327,058
    m_3 = -1.549,872         α_{3:x} = -.122,991
    m_4 = 71.615,158         α_{4:x} = 2.442,17

From Theorem VII,

    \hat M_x = 67.99

From (37) and (39), we obtain

    δ_{z_y} = -.009,833
    \hat{\bar μ}_{2:x} = 5.328,067

Substituting into (42), we have

    M_{M_x}        = 67.99
    \bar μ_{2:M_x} = .048,000,603,6      σ_{M_x} = .219,09
    \bar μ_{3:M_x} = 0                   α_{3:M_x} = 0
    \bar μ_{4:M_x} = .006,898,429        α_{4:M_x} = 2.994,03
    \bar μ_{5:M_x} = 0                   α_{5:M_x} = 0
    \bar μ_{6:M_x} = .001,649,027        α_{6:M_x} = 14.910,37

The coefficients of Charlier’s Type A Function (3) are as follows:

    c_3 = 0
    c_4 = -.000,250
    c_5 = 0
    c_6 = -.000,000,1


From the values we are justified in assuming that $M_x$ is normally distributed.
We may now ask ourselves concerning the probability that the mean of the
parent population, $M_x$, from which this sample is selected, exceeds 68.5 inches.

$$t = \frac{M_x - M_{M_x}}{\sigma_{M_x}} = \frac{68.5 - 67.99}{.219,09} = 2.3278,$$

$$P = \int_{2.3278}^{\infty}\phi(t)\,dt = .009962.$$
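The chain of computations in this example can be retraced as follows. This is a sketch under the reading of the formulas adopted here: $\delta_{z_y}$ from (37) written in the form "2 minus a ratio", the variance estimate from (39), the standard deviation of the hypothetical means from (43), and a normal tail area evaluated with the complementary error function. The function and variable names are assumptions.

```python
from math import erfc, sqrt


def delta_zy(r, s):
    """Formula (37) for a normal parent population."""
    num = (s - 3) * (4 * (s - 2 * r) ** 2 * (s - 1) + 2 * r * (s - 2) ** 2 * (s - r))
    den = (s - 2) * (
        2 * (s - 1) * (s * s + s - 6 * r * s + 6 * r * r)
        + r * (s - r) * (s - 2) * (s - 3)
        - s * (r - 1) * (s - r - 1)
    )
    return 2 - num / den


r, s = 100, 1000          # sample size and parent population size
m1, m2 = 67.99, 5.4152    # sample mean and sample second moment from the text

d = delta_zy(r, s)
var_hat = m2 / (1 + 2 * (s - 2 * r) / (r * (s - 2) * (1 + 2 * d)))  # formula (39)
sd_mean = sqrt(var_hat * (s - r) / (r * (s - 1)))                   # formula (43)
t = (68.5 - m1) / sd_mean
p_exceeds = 0.5 * erfc(t / sqrt(2))  # upper tail of the standard normal curve

print(d, var_hat, sd_mean, t, p_exceeds)
```

The results agree with the printed 5.328,067, .219,09, 2.3278 and .009962 to the number of figures that matter; the value of $\delta_{z_y}$ comes out near -.0099, slightly different in the last figures from the one printed.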
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Let us now come back to investigation of the general case for the distribution 
of the hypothetical means of the parent population. Because there is no definite
relation between the values of r and s, except $r \leq s$, and because, by assumption,
our parent population is normal, $\delta_{z_x}$ is a function of r and s (22); that is

$$\delta_{z_x} = \frac{2s(r-1)(s-r-1)}{s(r-1)(s-r-1) - r(s-r)(s-2)(s-3)}.$$

Consequently, it is necessary for us to investigate the different values of $\delta_{z_x}$ with
respect to various combinations of r and s before we can tell which Type of
Pearson’s Curves will best fit the distribution of the means of the parent
population. Hence, Table V:





TABLE V
Relation of the Values of δ_{z_x} with Various Combinations of r and s

    s                    r                   δ_{z_x}
    10r                  100                 -.0020
    10r                   50                 -.0040
    10r                   10                 -.0189
    5r                   100                 -.0040
    5r                    50                 -.0080
    5r                    10                 -.0397
    2r                   100                 -.0101
    2r                    50                 -.0204
    2r                    10                 -.1118
    r + 1                any finite value     0
    any finite value       1                  0
    s → ∞                any finite value     0
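The entries of Table V follow directly from (22) and can be regenerated in a few lines. The sketch below is an illustration only; the dictionary layout and function name are assumptions, and the multiples 10, 5 and 2 are the ratios s/r read from the table.

```python
def delta_zx(r, s):
    """Formula (22): delta for the distribution of sample means
    drawn from a finite normal parent population of size s."""
    x = s * (r - 1) * (s - r - 1)
    d = r * (s - r) * (s - 2) * (s - 3)
    return 2 * x / (x - d)


# keys are (s / r, r); values are delta from (22)
table_v = {(k, r): delta_zx(r, k * r) for k in (10, 5, 2) for r in (100, 50, 10)}

for (k, r), value in table_v.items():
    print(f"s = {k}r, r = {r:3d}: {value:.4f}")

# the degenerate rows of the table: r = 1 and s = r + 1 both give zero
print(delta_zx(1, 50), delta_zx(7, 8))
```

All nine computed values are negative, which is consistent with the later conclusion that $\alpha_{4:M_x} - 3 < 0$ for a finite normal parent population.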


From the above table we observe: 

1) For an infinite normal parent population, the frequency distribution of the
hypothetical means of the parent population is normal, because both $\alpha_{3:M_x}$ and
$\delta_{z_x}$ are equal to 0 (See Part I, Section III).

2) For any finite, normal parent population, if r = 1, the frequency distribu- 
tion of the hypothetical means of the parent population is normal. 

3) For any finite, normal parent population, if a sample r = s — 1 is chosen, 
the frequency distribution of the hypothetical means of the parent population 
is normal. 

4) For any finite, normal parent population, if s is equal to 5r or more and at 
the same time r is at least equal to fifty, the normal curve is a fair approximation 
for the distribution of the hypothetical means of the parent population. 
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5) For the other cases in which | 6,, | is not negligibly small, we ought to make 
further investigation. 

Now, to carry out further investigation for the cases where $|\delta_{z_x}|$ is not very
small, we need only look back to formulae (43), from which we observe that
$\alpha_{4:M_x} - 3 < 0$ unless $s = r + 1$, $r = 1$, or s approaches infinity.

Because of the fact that $\alpha_{3:M_x} = 0$ and $\alpha_{4:M_x} < 3$ is the criterion for Type II,¹⁵
we conclude that Type II will be the best fitting curve for the cases mentioned
in 5) above. To obtain this Type II curve we proceed as follows:

Let the equation of the curve associated with the distribution of the
hypothetical means of the parent population with which we are concerned be
$y = P_{M_x}(t)$. Then

$$\frac{1}{y}\frac{dy}{dt} = \frac{a - t}{b_0 + b_1 t + b_2 t^2} = \frac{a - t}{-b_2(t + R)(R - t)},$$

where

$$R = \frac{-b_1 \pm\sqrt{b_1^2 - 4b_0b_2}}{2b_2}.$$

By proper substitution with the formulae in Part I, Section III, we obtain

$$R = \frac{-\alpha_{3:M_x} \pm\sqrt{\alpha_{3:M_x}^2 - 4\delta_{z_x}(2 + \delta_{z_x})}}{2\delta_{z_x}} = \pm\sqrt{-\frac{2 + \delta_{z_x}}{\delta_{z_x}}}, \quad\text{since } \alpha_{3:M_x} = 0 \text{ from (44)}. \tag{45}$$

For the same reason $a = -\dfrac{\alpha_{3:M_x}}{2(1 + 2\delta_{z_x})} = 0$; therefore the differential equation
may be rewritten as

$$\frac{1}{y}\frac{dy}{dt} = \frac{t}{b_2(R^2 - t^2)},$$

from which we obtain

$$y = y_0\,(R^2 - t^2)^q, \quad\text{where } q = -\frac{1}{2b_2} = -\frac{1 + 2\delta_{z_x}}{\delta_{z_x}}. \tag{46}$$

Imposing the condition that the total area under the curve be equal to unity,
we set

$$1 = \int_{-R}^{R} y\,dt = y_0\int_{-R}^{R}(R^2 - t^2)^q\,dt.$$







15 Elderton, W. P., op. cit., Table VI, opposite p. 46.














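Substituting $t = -R + 2Ru$, we have

$$1 = y_0\int_0^1 (2R)^{2q+1}\,u^q(1 - u)^q\,du = y_0\,(2R)^{2q+1}\,B(q + 1,\, q + 1);$$

hence

$$y_0 = \frac{1}{(2R)^{2q+1}}\cdot\frac{\Gamma(2q + 2)}{\Gamma(q + 1)\,\Gamma(q + 1)},$$

and

$$y = \frac{\Gamma(2q + 2)}{(2R)^{2q+1}\,\Gamma(q + 1)\,\Gamma(q + 1)}\,(R^2 - t^2)^q
= \frac{1}{2^{2q+1}\sqrt{2q + 3}}\cdot\frac{\Gamma(2q + 2)}{\Gamma(q + 1)\,\Gamma(q + 1)}\Big(1 - \frac{t^2}{2q + 3}\Big)^q, \tag{47}$$

where q may be expressed in terms of r and s by means of (46) and (22). Thus

$$q = \frac{r(s-r)(s-2)(s-3) - 5s(r-1)(s-r-1)}{2s(r-1)(s-r-1)}. \tag{48}$$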
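A numerical check of the Type II curve may be useful here. The sketch below evaluates q from (48) for one illustrative pair (r = 10, s = 20, chosen only because $|\delta_{z_x}|$ is not small there), then integrates the density (47) by Simpson's rule to confirm that the area and the second moment are both unity and that the fourth moment agrees with $\alpha_{4:M_x}$ from (43). The integration routine and all names are assumptions of this sketch.

```python
from math import exp, lgamma, log, sqrt


def q_exponent(r, s):
    """Formula (48)."""
    x = s * (r - 1) * (s - r - 1)
    return (r * (s - r) * (s - 2) * (s - 3) - 5 * x) / (2 * x)


def type2_density(t, q):
    """Formula (47) in standard units; the curve ends at |t| = sqrt(2q + 3)."""
    base = max(0.0, 1.0 - t * t / (2 * q + 3))  # guard against rounding below zero
    log_y0 = (lgamma(2 * q + 2) - 2 * lgamma(q + 1)
              - (2 * q + 1) * log(2) - 0.5 * log(2 * q + 3))
    return exp(log_y0) * base ** q


def simpson(f, a, b, n=4000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    total += 2 * sum(f(a + 2 * i * h) for i in range(1, n // 2))
    return total * h / 3


r, s = 10, 20
q = q_exponent(r, s)
R = sqrt(2 * q + 3)

area = simpson(lambda t: type2_density(t, q), -R, R)
second = simpson(lambda t: t ** 2 * type2_density(t, q), -R, R)
fourth = simpson(lambda t: t ** 4 * type2_density(t, q), -R, R)
alpha4_from_43 = 3 - 6 * s * (r - 1) * (s - r - 1) / (r * (s - r) * (s - 2) * (s - 3))

print(q, area, second, fourth, alpha4_from_43)
```

The log-gamma function is used for the constant $y_0$ only to avoid overflow when q is large, as it is when s is many times r.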

To sum up: In describing the distribution of the hypothetical means of a 
parent population from which our sample is chosen, we have the following 
theorems: 

Theorem XI. The frequency distribution of the hypothetical means of an 
infinite, normal parent population is normal. 

Theorem XII. The frequency distribution of the hypothetical means of a 
finite, normal parent population is normal if r = s — 1. 

Theorem XIII. The frequency distribution of the hypothetical means of a 
finite, normal parent population is very nearly normal if s is equal to 5r or more 
and r is at least equal to fifty. 

Theorem XIV. The frequency distribution of the hypothetical means of a
finite, normal parent population is according to Type II for the cases in which
$|\delta_{z_x}|$ is not negligibly small.


Section IV. PROBABLE ERROR OF THE MEAN 


To measure the fluctuation of a sample mean from the true mean of the parent
population, it is customary to use the term “probable error” to denote the
expression:

$$E_M = 0.6745\,\frac{\sigma_x}{\sqrt{r}}, \tag{49}$$

where $\sigma_x$ is the standard deviation of the parent population. As the true value
of $\sigma_x$ is not known, it is the common practice to substitute for it the value
$\sqrt{\dfrac{r}{r-1}}\,\sigma_x$, where $\sigma_x$ is now the square root of the expected value of the sample
second moment.
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Therefore (49) is rewritten as 


$$E_M = 0.6745\,\frac{\sigma_x}{\sqrt{r - 1}}. \tag{50}$$

Still, it should be noted, this expression is an approximation. Now from our
theory of inverse sampling, as far as a normal parent population is assumed,
we have obtained for the probable error of the mean

$$E_M = 0.6745\,\frac{\sigma_x}{\sqrt{r + 2}}, \tag{51}$$

where $\sigma_x$ is definitely the standard deviation of an observed sample. Although
for large r, (50) and (51) do not differ much, yet (51) is obtained directly in
terms of the standard deviation of an observed sample.

To illustrate, consider the same sample of the heights of 100 freshman students
(See Table IV) as obtained from an infinite parent population. Since the mean
is 67.99 and the standard deviation is 2.327058, the probable error of the
mean is

$$E_M = 0.6745\times\frac{2.327058}{\sqrt{102}} = .1554152;$$

that is, $M_x = 67.99 \pm .1554152$, which shows that the chances are even that the
true mean of the parent population lies within the range 67.834,584,8 and
68.145,415,2.
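The arithmetic of this illustration is short enough to verify directly. The sketch below assumes that (51) has the form $0.6745\,\sigma_x/\sqrt{r+2}$, which is what the divisor $\sqrt{102}$ in the worked example implies for r = 100, and that the customary formula (50) uses $\sqrt{r-1}$; both forms are readings of a damaged passage, and the function names are hypothetical.

```python
from math import sqrt


def probable_error_inverse(sigma_x, r):
    """Assumed form of (51) for an infinite normal parent population."""
    return 0.6745 * sigma_x / sqrt(r + 2)


def probable_error_customary(sigma_x, r):
    """Assumed form of (50), the customary approximation."""
    return 0.6745 * sigma_x / sqrt(r - 1)


sigma_x, r, mean = 2.327058, 100, 67.99

e51 = probable_error_inverse(sigma_x, r)
e50 = probable_error_customary(sigma_x, r)
interval = (mean - e51, mean + e51)

print(e51, e50, interval)
```

For r = 100 the two expressions differ only in the third decimal place, in line with the remark that (50) and (51) do not differ much for large r.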


Section V. DISTRIBUTION OF THE HYPOTHETICAL VARIANCES OF THE PARENT
POPULATION

Recalling the fact we have stated in Part III, Section II, that the consideration
of the distribution of the second moments of samples about the most
probable value of the mean is equivalent to the consideration of a distribution
of sample means drawn from a parent population $y_1, y_2, y_3, \cdots, y_s$, where $y_i =
(x_i - m_1)^2$, since in a normal parent population $\hat M_x = m_1$ [See (24)], we can write
down in perfect analogy with (12) and (14)

$$\bar\mu_{n:p} = (-1)^n\,\bar\mu_{n:z_y}, \qquad M_p = m_2. \tag{52}$$

Now

$$\bar\mu_{n:p} = \bar\mu_{n:M_y} = \bar\mu_{n:\frac{\sum (x_i - m_1)^2}{N}} = \bar\mu_{n:\bar\mu_{2:x}}$$

since we have assumed the mean of the parent population to be its most probable
value, i.e., $m_1$. Hence by virtue of (52) and (34), the frequency distribution of
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the hypothetical variances of the parent population, which is assumed to be 
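normal, is characterized by

$$\begin{aligned}
M_{\bar\mu_{2:x}} &= m_2, \\
\sigma_{\bar\mu_{2:x}} &= \sqrt{\frac{2(s-r)}{r(s-1)}}\;\sigma_x^2 = \sqrt{\frac{2(s-r)}{r(s-1)}}\;\hat{\bar\mu}_{2:x}, \\
\alpha_{3:\bar\mu_{2:x}} &= -\alpha_{3:z_y} = -\frac{2(s-2r)}{s-2}\sqrt{\frac{2(s-1)}{r(s-r)}}, \\
\alpha_{4:\bar\mu_{2:x}} - 3 &= \alpha_{4:z_y} - 3 = \frac{12(s-1)(s^2 + s - 6rs + 6r^2) - 6s(r-1)(s-r-1)}{r(s-r)(s-2)(s-3)},
\end{aligned} \tag{53}$$

since we assume the most probable value of the variance of the parent population
to be its variance.

For an infinite parent population, i.e., $s \to \infty$, we have

$$\begin{aligned}
M_{\bar\mu_{2:x}} &= m_2, \\
\sigma_{\bar\mu_{2:x}} &= \sqrt{\frac{2}{r}}\;\hat{\bar\mu}_{2:x}, \\
\alpha_{3:\bar\mu_{2:x}} &= -2\sqrt{\frac{2}{r}}, \\
\alpha_{4:\bar\mu_{2:x}} - 3 &= \frac{12}{r}.
\end{aligned} \tag{54}$$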


Now if we can find the equation of the curve associated with the distribution
of the hypothetical variances of the parent population, we shall be able to 
ascertain the probability that a variance lies between certain specified limits 
after a sample is drawn from the parent population. 

For illustration, we will use the same sample of the heights of 100 freshman 
students (See Table IV) as selected from a parent population of 1000. 














We have s = 1000, r = 100,

    m_1 = 67.99
    m_2 = 5.4152,  σ_x = 2.327058

From (37) we compute

    δ_{z_y} = -.0098

As $|\delta_{z_y}|$ is negligibly small, we may be justified in considering $\bar\mu_{2:x}$ to be
distributed according to Type III (Part I, Section III).
It follows from (39) that

    \hat{\bar μ}_{2:x} = 5.32
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We compute the moments of the distribution of the hypothetical variances in 
accordance with (53). Thus 


    M_{\bar μ_{2:x}} = 5.4152
    σ_{\bar μ_{2:x}} = .506
    α_{3:\bar μ_{2:x}} = -.239,946
    α_{4:\bar μ_{2:x}} = 3.055,75

If we now wish to ascertain the probability that the variance of the parent
population lies between $\bar\mu_{2:x} = a = 5.5$ and $\bar\mu_{2:x} = b = 6.5$, we first convert a, b
into standard units such that $t_a = .1525$ and $t_b = 1.9511$ and then evaluate the
following integral:¹⁶


4 


2 — 4 2 
(2) Se fis /9 eS 

poled pm (a gy, 
"(3 1525 \Q3 


But this step is now not necessary since we have access to Tables of Pearson’s 
Type III Function.” Hence we find from this table our desired probability. 


P = .39146 


In the above numerical example, we are justified in using Type III because
$|\delta_{z_y}|$ is negligibly small. For the general case, however, we ought to make
further investigation concerning the values of $\delta_{z_y}$.


TABLE VI
Relation of Values of δ_{z_y} with Various Combinations of r and s

    s                    r                   δ_{z_y}
    10r                  100                 -.0098
    10r                   50                 -.0194
    10r                   10                 -.0859
    5r                   100                 -.0200
    5r                    50                 -.0400
    5r                    10                 -.1983
    2r                   100                 -.0518
    2r                    50                 -.1073
    2r                    10                 -.7642
    s → ∞                any finite value     0


16 Elderton, P. E., op. cit., p. 90. 
17 Salvosa, L. R., Tables of Pearson’s Type III Functions, Annals of Mathematical 
Statistics Vol. 1, No. I. 
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Recalling that 6., is a function of r and s such that 


$$\delta_{z_y} = 2 - \frac{(s-3)\{4(s-2r)^2(s-1) + 2r(s-2)^2(s-r)\}}{(s-2)\{2(s-1)(s^2 + s - 6rs + 6r^2) + r(s-r)(s-2)(s-3) - s(r-1)(s-r-1)\}},$$

we construct Table VI of $\delta_{z_y}$ for different combinations of s and r.
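The closed form for $\delta_{z_y}$ can be cross-checked against its definition. The sketch below computes $\alpha_{3:z_y}$ and $\alpha_{4:z_y}$ from (34), forms $\delta = (2\alpha_4 - 3\alpha_3^2 - 6)/(\alpha_4 + 3)$, and compares the result with the closed form taken in the shape "2 minus a ratio". That shape is a reading of a damaged formula, adopted because it reproduces the tabulated values; the names are hypothetical. In the scan the entry for s = 2r, r = 10 is hard to read, and the formula gives about -.7642 there.

```python
from math import sqrt


def alphas_zy(r, s):
    """alpha_3 and alpha_4 of the sample second moments, formula (34)."""
    a3 = 2 * (s - 2 * r) / (s - 2) * sqrt(2 * (s - 1) / (r * (s - r)))
    a4 = 3 + (
        12 * (s - 1) * (s * s + s - 6 * r * s + 6 * r * r)
        - 6 * s * (r - 1) * (s - r - 1)
    ) / (r * (s - r) * (s - 2) * (s - 3))
    return a3, a4


def delta_from_alphas(a3, a4):
    """Definition of delta in terms of skewness and kurtosis."""
    return (2 * a4 - 3 * a3 ** 2 - 6) / (a4 + 3)


def delta_closed_form(r, s):
    """Closed form in r and s, read as 2 minus a ratio."""
    num = (s - 3) * (4 * (s - 2 * r) ** 2 * (s - 1) + 2 * r * (s - 2) ** 2 * (s - r))
    den = (s - 2) * (
        2 * (s - 1) * (s * s + s - 6 * r * s + 6 * r * r)
        + r * (s - r) * (s - 2) * (s - 3)
        - s * (r - 1) * (s - r - 1)
    )
    return 2 - num / den


table_vi = {(k, r): delta_closed_form(r, k * r) for k in (10, 5, 2) for r in (100, 50, 10)}

for (k, r), value in table_vi.items():
    check = delta_from_alphas(*alphas_zy(r, k * r))
    print(f"s = {k}r, r = {r:3d}: closed form {value:.4f}, from alphas {check:.4f}")
```

Every computed value is negative, in agreement with the observation that $\delta_{z_y}$ seems to be always negative.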

From Table VI we observe the following facts. 

1) For an infinite, normal parent population, the distribution of the
hypothetical variances of the parent population is according to Type III.

2) For a finite, normal parent population, if s is at least equal to 5r and r
at least fifty, the distribution of the hypothetical variances of the parent
population is very nearly according to Type III.

3) For the other cases in which $\delta_{z_y}$ is not small but negative in sign, the
distribution of the hypothetical variances of the parent population needs further
investigation.

From Part I, Section III, $\kappa = \dfrac{\alpha_3^2}{4\delta(2 + \delta)}$; and since we know that $\delta$ is always
greater than $-2$, therefore whether $\kappa$ is positive or negative depends upon
whether $\delta$ is positive or negative.

Now from Table VI we observe that $\delta_{z_y}$ seems to be always negative; hence $\kappa$
is negative. In accordance with the criterion for fitting curves, the frequency
distribution of the variances of a normal parent population in such cases is
according to Type I, which takes the form:¹⁸





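$$y = \frac{\Gamma(m_1 + m_2 + 2)}{\Gamma(m_1 + 1)\,\Gamma(m_2 + 1)}\cdot\frac{(t - R_2)^{m_1}\,(R_1 - t)^{m_2}}{(R_1 - R_2)^{m_1 + m_2 + 1}}, \tag{55}$$

where

$$m_1 = \frac{a - R_2}{b_2(R_2 - R_1)}, \qquad m_2 = \frac{a - R_1}{b_2(R_1 - R_2)};$$

$R_1$, $R_2$ are the positive and negative roots, respectively, of the equation
$b_0 + b_1 t + b_2 t^2 = 0$ and can be expressed in terms of the first four moments:

$$R_1,\, R_2 = \frac{-\alpha_3 \pm\sqrt{\alpha_3^2 - 4\delta(2 + \delta)}}{2\delta}.$$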

We may sum up the foregoing in the following theorems: 

Theorem XV. The frequency distribution of the hypothetical variances of an 
infinite, normal parent population is according to Type III. 

Theorem XVI. The frequency distribution of the hypothetical variances of a
finite, normal parent population approximates to Type III Curve if r and s are
of such combinations that $|\delta_{z_y}|$ turns out to be negligibly small.

Theorem XVII. The frequency distribution of the hypothetical variances of 


18 Elderton, W. P., op. cit., p. 54. 
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a finite, normal parent population is according to Type I in case that 6., is not 
very nearly equal to zero and is negative. 


Part IV. Inverse Sampling Associated with a Parent Population Distributed 
According to Pearson’s Type III Function 


Instead of a normal parent population as we have assumed throughout our 
discussion in Part III, we shall assume in this part a parent population which is 
distributed according to Type III. Therefore, besides the distribution of the 
hypothetical means and that of the hypothetical variances of the parent popula- 
tion, the distribution of the hypothetical third moments will also be considered. 


We shall carry out our discussion in practically the same way as we have done in 
Part III. 


Section I. MOST PROBABLE VALUE OF THE MEAN OF THE PARENT POPULATION


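We have already obtained a general expression for the most probable value of
the mean of the parent population:

$$\hat M_x = m_1 - \frac{s-2r}{r(s-2)}\cdot\frac{\sigma_x\,\alpha_{3:x}}{2(1 + 2\delta_{z_x})},$$

where, as before, $\delta_{z_x}$ is defined by (17).
But we are now concerned with a parent population which is distributed
according to Type III.

Since the recursion relation of the moments of Type III distribution is of the
form

$$\alpha_{n+1} = n\Big(\alpha_{n-1} + \frac{\alpha_3}{2}\,\alpha_n\Big), \tag{56}$$

so that

$$\begin{aligned}
\alpha_4 &= 3(1 + \gamma), \quad\text{where } \gamma = \frac{\alpha_3^2}{2}, \\
\alpha_5 &= 2\alpha_3(5 + 3\gamma), \\
\alpha_6 &= 5(3 + 13\gamma + 6\gamma^2), \\
\alpha_7 &= 3\alpha_3(35 + 77\gamma + 30\gamma^2), \\
\alpha_8 &= 7(15 + 170\gamma + 261\gamma^2 + 90\gamma^3), \\
\alpha_9 &= 4\alpha_3(315 + 1652\gamma + 2007\gamma^2 + 630\gamma^3), \\
\alpha_{10} &= 9(105 + 2450\gamma + 8435\gamma^2 + 8658\gamma^3 + 2520\gamma^4), \\
\alpha_{11} &= 5\alpha_3(3465 + 35266\gamma + 91971\gamma^2 + 82962\gamma^3 + 22680\gamma^4), \\
\alpha_{12} &= 11(945 + 39375\gamma + 252245\gamma^2 + 537777\gamma^3 + 437490\gamma^4 + 113400\gamma^5),
\end{aligned}$$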
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it follows from (5) that for a Type III distribution of the parent population 


| M., =M, 


joan 4/ Babe 
; r(s—1) 


(57) ont fo-i 
| Q3:2y — tno shoe 9 r(s — r) ao 


is 1)(s? +s — 6rs + 6r*) O52 - 6s(r _ -—1j(s-7 yr —1) 
r(s — r)(s — 2)(s — 3) 2 = r(s — r)(s — 2)(s — 3) 





4:25 o> 
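The Type III recursion $\alpha_{n+1} = n(\alpha_{n-1} + \tfrac{1}{2}\alpha_3\alpha_n)$ lends itself to a mechanical check. The following is a minimal sketch, not part of the original paper, that generates $\alpha_4,\dots,\alpha_{12}$ as polynomials in $\gamma = \alpha_3^2/2$ with exact rational arithmetic. The function name `type3_alpha_polys` and the coefficient-list representation are illustrative assumptions.

```python
from fractions import Fraction


def type3_alpha_polys(n_max):
    """Standardized moments alpha_n of a Type III curve from the recursion
    alpha_{n+1} = n * (alpha_{n-1} + (alpha_3 / 2) * alpha_n).

    Each alpha_n is returned as a list of coefficients of powers of
    gamma = alpha_3**2 / 2 (lowest power first). For odd n the polynomial
    is understood to carry one extra factor alpha_3.
    """
    def add(p, q):
        # Add two coefficient lists of possibly different length.
        size = max(len(p), len(q))
        p = p + [Fraction(0)] * (size - len(p))
        q = q + [Fraction(0)] * (size - len(q))
        return [a + b for a, b in zip(p, q)]

    # alpha_1 = 0 and alpha_2 = 1 for a standardized variable.
    polys = {1: [Fraction(0)], 2: [Fraction(1)]}
    for n in range(2, n_max):
        prev, cur = polys[n - 1], polys[n]
        if n % 2 == 0:
            # n + 1 is odd: alpha_3 * (Q_{n-1} + P_n / 2)
            term = [c / 2 for c in cur]
        else:
            # n + 1 is even: P_{n-1} + gamma * Q_n, since alpha_3**2 / 2 = gamma
            term = [Fraction(0)] + cur
        polys[n + 1] = [n * c for c in add(prev, term)]
    return polys


polys = type3_alpha_polys(12)
# alpha_11 = 5 * alpha_3 * (3465 + 35266 g + 91971 g^2 + 82962 g^3 + 22680 g^4)
print([int(c) // 5 for c in polys[11]])
```

Working with coefficient lists rather than floating-point values keeps the comparison with the printed polynomials exact.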


Therefore for the most probable value of the mean of the parent population,
we have the same form as (20):

\[
\hat M_x = m - \frac{s-2r}{r(s-2)}\cdot\frac{\sigma_x\,\alpha_{3:x}}{2(1+2\delta_{\bar x})},
\]

except now instead of (17)

\[
\delta_{\bar x} = 2 - \frac{(s-3)\{2(s-1)(s-2r)^2\alpha_{3:x}^2 + 8r(s-2)^2(s-r)\}}{(s-2)\{(s-1)(s^2+s-6rs+6r^2)\alpha_{3:x}^2 + 4r(s-2)(s-3)(s-r) - 4s(r-1)(s-r-1)\}}. \tag{58}
\]


We observe that if $\alpha_{3:x} = 0$, this comes back to the case of a normal parent
population, which we have already treated in Part III.

But if $s \to \infty$ while $\alpha_{3:x}$ is finite, then $\delta_{\bar x} = 0$. Therefore, for the limiting
case, i.e., when the parent population is infinite, we have

\[
\hat M_x = m - \frac{\sigma_x\,\alpha_{3:x}}{2r}. \tag{59}
\]

Since $\sigma_x$ and $\alpha_{3:x}$ are not known, we impose the condition that they assume
their best approximated ‘most probable values’ respectively. Hence, we rewrite
(20), (59) in the following forms:

\[
\hat M_x = m - \frac{s-2r}{r(s-2)}\cdot\frac{\hat\sigma_x\,\hat\alpha_{3:x}}{2(1+2\delta_{\bar x})}, \tag{60}
\]

where now

\[
\delta_{\bar x} = 2 - \frac{(s-3)\{2(s-1)(s-2r)^2\hat\alpha_{3:x}^2 + 8r(s-2)^2(s-r)\}}{(s-2)\{(s-1)(s^2+s-6rs+6r^2)\hat\alpha_{3:x}^2 + 4r(s-2)(s-3)(s-r) - 4s(r-1)(s-r-1)\}}, \tag{60b}
\]

and for the infinite case

\[
\hat M_x = m - \frac{\hat\sigma_x\,\hat\alpha_{3:x}}{2r}. \tag{61}
\]

So we state our theorem: 

Theorem XVIII. For a parent population which is distributed according to
Type III, the best approximated ‘most probable value’ of the mean is the mean
of an observed sample from it minus a correction factor which is a function of
$r$, $s$, $\hat\sigma_x$, and $\hat\alpha_{3:x}$.

It is also interesting to note that when $s = 2r$ and $\delta_{\bar x} \neq -\tfrac{1}{2}$, then $\hat M_x = m$.
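The correction to the sample mean can be evaluated directly. The sketch below is an illustration only: the function names `delta_xbar` and `most_probable_mean` are hypothetical, the finite-population criterion is coded in the form $2 - (\cdot)/(\cdot)$ with the last bracketed term read as $4s(r-1)(s-r-1)$, and the infinite case is taken as $m - \hat\sigma_x\hat\alpha_{3:x}/(2r)$.

```python
def delta_xbar(alpha3, r, s):
    """Criterion delta for the distribution of sample means (finite s)."""
    a2 = alpha3 ** 2
    num = (s - 3) * (2 * (s - 1) * (s - 2 * r) ** 2 * a2
                     + 8 * r * (s - 2) ** 2 * (s - r))
    den = (s - 2) * ((s - 1) * (s * s + s - 6 * r * s + 6 * r * r) * a2
                     + 4 * r * (s - 2) * (s - 3) * (s - r)
                     - 4 * s * (r - 1) * (s - r - 1))
    return 2 - num / den


def most_probable_mean(m, sigma, alpha3, r, s=None):
    """Most probable mean; s=None stands for an infinite parent population."""
    if s is None:
        return m - sigma * alpha3 / (2 * r)
    d = delta_xbar(alpha3, r, s)
    return m - (s - 2 * r) / (r * (s - 2)) * sigma * alpha3 / (2 * (1 + 2 * d))


# Values taken from the numerical illustration of Part IV, Section IV.
print(most_probable_mean(138.3, 14.5452, 0.6501, 100))
```

With $s = 2r$ the factor $s - 2r$ vanishes, so the function returns the sample mean unchanged, in agreement with the remark above.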


Section II. MOST PROBABLE VALUE OF THE STANDARD DEVIATION OF THE
PARENT POPULATION


We consider, as we have done in Part III, Section II, $_sC_r$ possible samples,
each consisting of $r$ variates chosen from a parent population of $s$ variates. The second
moment of each sample computed about the most probable value of the mean of
the parent population may be written as

\[
\begin{aligned}
&\frac{1}{r}\{(x_1-\hat M_x)^2 + (x_2-\hat M_x)^2 + \cdots + (x_r-\hat M_x)^2\},\\
&\frac{1}{r}\{(x_2-\hat M_x)^2 + (x_3-\hat M_x)^2 + \cdots + (x_{r+1}-\hat M_x)^2\},\\
&\qquad\cdots\cdots\\
&\frac{1}{r}\{(x_{s-r+1}-\hat M_x)^2 + (x_{s-r+2}-\hat M_x)^2 + \cdots + (x_s-\hat M_x)^2\}.
\end{aligned}
\]

If we write $(x_i - \hat M_x)^2 = y_i$, the above may be considered as a distribution of
sample means drawn from a parent population $y_1, y_2, y_3, \cdots, y_s$. Therefore,
as in (27),

\[
\mu'_{n:y} = \mu_{2n:x} + \binom{2n}{1}\mu_{2n-1:x}(M_x-\hat M_x) + \binom{2n}{2}\mu_{2n-2:x}(M_x-\hat M_x)^2 + \binom{2n}{3}\mu_{2n-3:x}(M_x-\hat M_x)^3 + \cdots + (M_x-\hat M_x)^{2n}.
\]

When we impose the condition that the most probable value of the mean of
the parent population be its mean, then the above yields

\[
\mu'_{n:y} = \mu_{2n:x}, \qquad \mu'_{1:y} = M_y = \mu_{2:x}.
\]

Consequently

\[
\begin{aligned}
\mu_{k:y} = \frac{\sum (y-M_y)^k}{s}
&= \mu'_{k:y} - \binom{k}{1}\mu'_{k-1:y}M_y + \binom{k}{2}\mu'_{k-2:y}M_y^2 - \cdots + (-1)^k M_y^k\\
&= \mu_{2k:x} - \binom{k}{1}\mu_{2k-2:x}\,\mu_{2:x} + \binom{k}{2}\mu_{2k-4:x}\,\mu_{2:x}^2 - \cdots + (-1)^k\mu_{2:x}^k.
\end{aligned}
\]

Now, since we assume a Type III distribution for our parent
population, we have

\[
\begin{aligned}
\mu_{2:y} &= \mu_{4:x} - \mu_{2:x}^2 = (3\gamma+2)\,\sigma_x^4,\\
\mu_{3:y} &= \mu_{6:x} - 3\mu_{4:x}\mu_{2:x} + 2\mu_{2:x}^3 = (30\gamma^2+56\gamma+8)\,\sigma_x^6,\\
\mu_{4:y} &= \mu_{8:x} - 4\mu_{6:x}\mu_{2:x} + 6\mu_{4:x}\mu_{2:x}^2 - 3\mu_{2:x}^4 = (630\gamma^3+1707\gamma^2+948\gamma+60)\,\sigma_x^8,
\end{aligned}
\tag{62}
\]

etc.
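The polynomial forms of the central moments of $y = (x - M_x)^2$ can be confirmed numerically. The following is a rough sketch under stated assumptions: the helper names `type3_alphas` and `central_moment_of_power` are hypothetical, moments are expressed in units of the appropriate power of $\sigma_x$, and the standardized moments are generated from the Type III recursion $\alpha_{n+1} = n(\alpha_{n-1} + \tfrac12\alpha_3\alpha_n)$. The same helper also handles cubes $(x - M_x)^3$ by setting `p = 3`.

```python
from math import comb, isclose


def type3_alphas(alpha3, n_max):
    """Standardized moments alpha_0 .. alpha_{n_max} of a Type III curve."""
    a = [1.0, 0.0, 1.0, alpha3]
    for n in range(3, n_max):
        a.append(n * (a[n - 1] + alpha3 / 2 * a[n]))
    return a


def central_moment_of_power(alpha, p, k):
    """k-th central moment of u = (x - M_x)**p in units of sigma_x**(p*k).

    Expands (u - mean(u))**k binomially, using mean(u) = alpha[p].
    """
    return sum(comb(k, j) * (-1) ** j * alpha[p * (k - j)] * alpha[p] ** j
               for j in range(k + 1))


alpha3 = 0.6501                      # any trial skewness will do
g = alpha3 ** 2 / 2                  # gamma
a = type3_alphas(alpha3, 12)
print(isclose(central_moment_of_power(a, 2, 2), 3 * g + 2))
print(isclose(central_moment_of_power(a, 2, 3), 30 * g**2 + 56 * g + 8))
```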
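Substituting (62) into (26), we have

\[
\begin{aligned}
M_{\bar y} &= \sigma_x^2,\\
\sigma_{\bar y} &= \sigma_x^2\sqrt{\frac{s-r}{r(s-1)}\,(3\gamma+2)},\\
\alpha_{3:\bar y} &= \frac{s-2r}{s-2}\sqrt{\frac{s-1}{r(s-r)}}\cdot\frac{30\gamma^2+56\gamma+8}{(3\gamma+2)^{3/2}},\\
\alpha_{4:\bar y}-3 &= \frac{(s-1)(s^2+s-6rs+6r^2)}{r(s-r)(s-2)(s-3)}\cdot\frac{630\gamma^3+1680\gamma^2+912\gamma+48}{(3\gamma+2)^2} - \frac{6s(r-1)(s-r-1)}{r(s-r)(s-2)(s-3)}.
\end{aligned}
\tag{63}
\]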
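For an infinite parent population, the above yields by allowing $s \to \infty$

\[
\begin{aligned}
M_{\bar y} &= \sigma_x^2,\\
\sigma_{\bar y} &= \sigma_x^2\sqrt{\frac{3\gamma+2}{r}},\\
\alpha_{3:\bar y} &= \frac{1}{\sqrt r}\cdot\frac{30\gamma^2+56\gamma+8}{(3\gamma+2)^{3/2}},\\
\alpha_{4:\bar y}-3 &= \frac{1}{r}\cdot\frac{630\gamma^3+1680\gamma^2+912\gamma+48}{(3\gamma+2)^2}.
\end{aligned}
\tag{64}
\]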
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In accordance with (38), we write 


_s—2r (el 307° + 56y + 8 
(65) is «tt Fe) ee 


: 2(1 + 26.,) 
o; en y+) 


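It follows from Theorem IV that for the mode of the standard deviation of the
parent population, we have

\[
\hat\sigma_x^2 = \frac{\sum (x-\hat M_x)^2}{r} - \frac{\dfrac{s-2r}{s-2}\sqrt{\dfrac{s-1}{r(s-r)}}\cdot\dfrac{30\gamma^2+56\gamma+8}{(3\gamma+2)^{3/2}}}{2(1+2\delta_{\bar y})}\;\sigma_x^2\sqrt{\frac{s-r}{r(s-1)}\,(3\gamma+2)}, \tag{66}
\]

where

\[
\delta_{\bar y} = \frac{2\alpha_{4:\bar y} - 3\alpha_{3:\bar y}^2 - 6}{\alpha_{4:\bar y} + 3}, \tag{67}
\]

which is a function of $r$, $s$, and $\alpha_{3:x}$.

Assuming the best approximated ‘most probable value’ $\hat\alpha_{3:x}$ for $\alpha_{3:x}$ and
remembering that

\[
\hat M_x = m - \frac{s-2r}{r(s-2)}\cdot\frac{\hat\sigma_x\,\hat\alpha_{3:x}}{2(1+2\delta_{\bar x})},
\]

we write (66) in the form of

\[
\begin{aligned}
\hat\sigma_x^2 &= m_2 + g^2\,\frac{\hat\sigma_x^2\,\hat\alpha_{3:x}^2}{(1+2\delta_{\bar x})^2} - g\,\frac{30\hat\gamma^2+56\hat\gamma+8}{(3\hat\gamma+2)(1+2\delta_{\bar y})}\,\hat\sigma_x^2\\
&= \frac{m_2}{1 + \dfrac{g(30\hat\gamma^2+56\hat\gamma+8)}{(3\hat\gamma+2)(1+2\delta_{\bar y})} - \dfrac{2g^2\hat\gamma}{(1+2\delta_{\bar x})^2}},
\end{aligned}
\tag{68}
\]

where

\[
g = \frac{s-2r}{2r(s-2)}, \qquad \hat\gamma = \frac{\hat\alpha_{3:x}^2}{2},
\]

$\delta_{\bar x}$ is given by (60b), where $\alpha_{3:x}$ is replaced by $\hat\alpha_{3:x}$, and
$\delta_{\bar y}$ is given by (67), where $\alpha_{3:x}$ is replaced by $\hat\alpha_{3:x}$.

We rewrite (68) in the abridged form:

\[
\hat\sigma_x^2 = \frac{m_2}{\phi^2(\hat\alpha_{3:x}, r, s)} \tag{69}
\]

or

\[
\hat\sigma_x = \frac{\sqrt{m_2}}{\phi(\hat\alpha_{3:x}, r, s)},
\]

where

\[
\phi^2(\hat\alpha_{3:x}, r, s) = 1 + \frac{g(30\hat\gamma^2+56\hat\gamma+8)}{(3\hat\gamma+2)(1+2\delta_{\bar y})} - \frac{2g^2\hat\gamma}{(1+2\delta_{\bar x})^2},
\]

and state our theorem:

Theorem XIX. The best approximated ‘most probable value’ of the standard
deviation of a parent population which is assumed to be distributed according to
Type III is equal to the standard deviation of an observed sample of it,
multiplied by $\dfrac{1}{\phi(\hat\alpha_{3:x}, r, s)}$.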


For an infinite parent population, $g = \dfrac{1}{2r}$, $\delta_{\bar x} = 0$ and

\[
\delta_{\bar y} = \frac{2(630\hat\gamma^3+1680\hat\gamma^2+912\hat\gamma+48)(3\hat\gamma+2) - 3(30\hat\gamma^2+56\hat\gamma+8)^2}{(3\hat\gamma+2)\,[(630\hat\gamma^3+1680\hat\gamma^2+912\hat\gamma+48) + 6r(3\hat\gamma+2)^2]}.
\]

Theorem XX. The best approximated ‘most probable value’ of the standard
deviation of an infinite parent population which is assumed to be distributed
according to Type III is equal to the standard deviation of an observed sample
of it, multiplied by $\dfrac{1}{\lim_{s\to\infty}\phi(\hat\alpha_{3:x}, r, s)}$.
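For the infinite case the shape constants of the distribution of $\bar y$ reduce to simple functions of $\hat\gamma = \hat\alpha_{3:x}^2/2$ and $r$. The sketch below is illustrative only (the function name `ybar_shape_infinite` is hypothetical). It evaluates $\alpha_{3:\bar y}$ and $\alpha_{4:\bar y}$, then compares the criterion $\delta_{\bar y}$ obtained from its definition $(2\alpha_4 - 3\alpha_3^2 - 6)/(\alpha_4 + 3)$ with a closed form in which the bracket is written as $K + 6r(3\hat\gamma+2)^2$, the sign that the definition yields algebraically. The trial values $\hat\alpha_{3:x} = .6501$ and $r = 100$ are those of the numerical illustrations in this Part.

```python
from math import sqrt


def ybar_shape_infinite(alpha3_hat, r):
    """Skewness, kurtosis and delta of y-bar for an infinite Type III parent."""
    g = alpha3_hat ** 2 / 2                       # gamma-hat
    d = 3 * g + 2
    n3 = 30 * g ** 2 + 56 * g + 8                 # numerator of alpha_3
    k4 = 630 * g ** 3 + 1680 * g ** 2 + 912 * g + 48
    a3 = n3 / (sqrt(r) * d ** 1.5)
    a4 = 3 + k4 / (r * d ** 2)
    delta_def = (2 * a4 - 3 * a3 ** 2 - 6) / (a4 + 3)
    delta_closed = (2 * k4 * d - 3 * n3 ** 2) / (d * (k4 + 6 * r * d ** 2))
    return a3, a4, delta_def, delta_closed


a3, a4, delta_def, delta_closed = ybar_shape_infinite(0.6501, 100)
print(round(a3, 6), round(a4, 7))
```

The two values of $\delta_{\bar y}$ returned by the function agree to rounding error, which is a quick way to guard against a sign slip in the closed form.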


Section III. MOST PROBABLE VALUE OF THE SKEWNESS OF THE PARENT
POPULATION


Let us again consider $_sC_r$ samples, each consisting of $r$ variates chosen from
a parent population of $s$ variates. The third moments of each sample computed about the
most probable value of the mean of the parent population may be written as

\[
\begin{aligned}
&\frac{1}{r}\{(x_1-\hat M_x)^3 + (x_2-\hat M_x)^3 + \cdots + (x_r-\hat M_x)^3\},\\
&\frac{1}{r}\{(x_2-\hat M_x)^3 + (x_3-\hat M_x)^3 + \cdots + (x_{r+1}-\hat M_x)^3\},\\
&\qquad\cdots\cdots\\
&\frac{1}{r}\{(x_{s-r+1}-\hat M_x)^3 + (x_{s-r+2}-\hat M_x)^3 + \cdots + (x_s-\hat M_x)^3\}.
\end{aligned}
\]

If we write $(x_i - \hat M_x)^3 = w_i$, the above may be considered as a distribution
of sample means drawn from a parent population $w_1, w_2, w_3, \cdots, w_s$.
Consequently in accordance with (5), we have

\[
\begin{aligned}
M_{\bar w} &= M_w,\\
\sigma_{\bar w} &= \sigma_w\sqrt{\frac{s-r}{r(s-1)}},\\
\alpha_{3:\bar w} &= \frac{s-2r}{s-2}\sqrt{\frac{s-1}{r(s-r)}}\;\alpha_{3:w},\\
\alpha_{4:\bar w}-3 &= \frac{(s-1)(s^2+s-6rs+6r^2)}{r(s-r)(s-2)(s-3)}\,\{\alpha_{4:w}-3\} - \frac{6s(r-1)(s-r-1)}{r(s-r)(s-2)(s-3)}.
\end{aligned}
\tag{71}
\]


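Let us write the analogous form of (27):

\[
\begin{aligned}
\mu'_{n:w} &= \frac{1}{s}\sum (x - M_x + M_x - \hat M_x)^{3n}\\
&= \mu_{3n:x} + \binom{3n}{1}\mu_{3n-1:x}(M_x-\hat M_x) + \binom{3n}{2}\mu_{3n-2:x}(M_x-\hat M_x)^2 + \cdots + (M_x-\hat M_x)^{3n}.
\end{aligned}
\tag{72}
\]

Imposing the same condition as before that $M_x$ assumes its most probable
value (i.e., $M_x = \hat M_x$), then (72) becomes

\[
\mu'_{n:w} = \mu_{3n:x}, \qquad \mu'_{1:w} = M_w = \mu_{3:x}. \tag{73}
\]

The $k$th moment of the distribution of $w$ about its mean will then be

\[
\begin{aligned}
\mu_{k:w} = \frac{\sum (w-M_w)^k}{s}
&= \mu'_{k:w} - \binom{k}{1}\mu'_{k-1:w}M_w + \binom{k}{2}\mu'_{k-2:w}M_w^2 - \cdots + (-1)^k M_w^k\\
&= \mu_{3k:x} - \binom{k}{1}\mu_{3k-3:x}\,\mu_{3:x} + \binom{k}{2}\mu_{3k-6:x}\,\mu_{3:x}^2 - \cdots + (-1)^k\mu_{3:x}^k.
\end{aligned}
\tag{74}
\]

Since we assume a Type III distribution of the parent population, we have
in accordance with the recursion relation (56)

\[
\begin{aligned}
M_w &= \mu_{3:x} = \alpha_{3:x}\,\sigma_x^3,\\
\mu_{2:w} &= \mu_{6:x} - \mu_{3:x}^2 = (15+63\gamma+30\gamma^2)\,\sigma_x^6,\\
\mu_{3:w} &= \mu_{9:x} - 3\mu_{6:x}\mu_{3:x} + 2\mu_{3:x}^3 = \alpha_{3:x}(1215+6417\gamma+7938\gamma^2+2520\gamma^3)\,\sigma_x^9,\\
\mu_{4:w} &= \mu_{12:x} - 4\mu_{9:x}\mu_{3:x} + 6\mu_{6:x}\mu_{3:x}^2 - 3\mu_{3:x}^4\\
&= (10395+423225\gamma+2722599\gamma^2+5851683\gamma^3+4792230\gamma^4+1247400\gamma^5)\,\sigma_x^{12}.
\end{aligned}
\tag{75}
\]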
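Substituting into (71), we have

\[
\begin{aligned}
M_{\bar w} &= \mu_{3:x} = \alpha_{3:x}\,\sigma_x^3,\\
\sigma_{\bar w} &= \sigma_x^3\sqrt{\frac{s-r}{r(s-1)}\,(30\gamma^2+63\gamma+15)},\\
\alpha_{3:\bar w} &= \frac{s-2r}{s-2}\sqrt{\frac{s-1}{r(s-r)}}\cdot\frac{\alpha_{3:x}(1215+6417\gamma+7938\gamma^2+2520\gamma^3)}{(15+63\gamma+30\gamma^2)^{3/2}},\\
\alpha_{4:\bar w}-3 &= \frac{(s-1)(s^2+s-6rs+6r^2)}{r(s-r)(s-2)(s-3)}\cdot\frac{9720+417555\gamma+2707992\gamma^2+5840343\gamma^3+4789530\gamma^4+1247400\gamma^5}{(15+63\gamma+30\gamma^2)^2}\\
&\qquad - \frac{6s(r-1)(s-r-1)}{r(s-r)(s-2)(s-3)}.
\end{aligned}
\tag{76}
\]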
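Allowing $s \to \infty$, we have for an infinite parent population

\[
\begin{aligned}
M_{\bar w} &= \mu_{3:x} = \alpha_{3:x}\,\sigma_x^3,\\
\sigma_{\bar w} &= \sigma_x^3\sqrt{\frac{1}{r}\,(30\gamma^2+63\gamma+15)},\\
\alpha_{3:\bar w} &= \frac{1}{\sqrt r}\cdot\frac{\alpha_{3:x}(1215+6417\gamma+7938\gamma^2+2520\gamma^3)}{(15+63\gamma+30\gamma^2)^{3/2}},\\
\alpha_{4:\bar w}-3 &= \frac{1}{r}\cdot\frac{9720+417555\gamma+2707992\gamma^2+5840343\gamma^3+4789530\gamma^4+1247400\gamma^5}{(15+63\gamma+30\gamma^2)^2}.
\end{aligned}
\tag{77}
\]

The best approximated ‘most probable value’ of $\mu_{3:x}$ may now be written
after the same fashion as in the preceding cases:

\[
\hat\mu_{3:x} = \frac{\sum (x-\hat M_x)^3}{r} - \frac{\sigma_{\bar w}\,\alpha_{3:\bar w}}{2(1+2\delta_{\bar w})}, \tag{78}
\]

where

\[
\delta_{\bar w} = \frac{2\alpha_{4:\bar w} - 3\alpha_{3:\bar w}^2 - 6}{\alpha_{4:\bar w} + 3}.
\]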
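Since

\[
\frac{\sum (x-\hat M_x)^3}{r} = \frac{1}{r}\sum\left(x - m + g\,\frac{\hat\sigma_x\,\hat\alpha_{3:x}}{1+2\delta_{\bar x}}\right)^3 \qquad [\text{from (60)}],
\]

and since we assume the best approximated ‘most probable values’ of the
standard deviation and the skewness for the standard deviation and the
skewness of the parent population respectively, we obtain from (78)

\[
\hat\sigma_x^3\,\hat\alpha_{3:x} = m_3 + 3m_2\,g\,\frac{\hat\sigma_x\,\hat\alpha_{3:x}}{1+2\delta_{\bar x}} + g^3\,\frac{\hat\sigma_x^3\,\hat\alpha_{3:x}^3}{(1+2\delta_{\bar x})^3} - g\,\frac{\hat\alpha_{3:x}(1215+6417\hat\gamma+7938\hat\gamma^2+2520\hat\gamma^3)}{(15+63\hat\gamma+30\hat\gamma^2)(1+2\delta_{\bar w})}\,\hat\sigma_x^3.
\]

The change of $\hat\mu_{3:x}$ to $\hat\sigma_x^3\,\hat\alpha_{3:x}$ involves a systematic error although it is small.
Again by proper substitution of (69) we have

\[
\frac{m_2^{3/2}\,\hat\alpha_{3:x}}{\phi^3(\hat\alpha_{3:x}, r, s)} = m_2^{3/2}\,\alpha_{3:x} + \frac{3\,m_2^{3/2}\,g\,\hat\alpha_{3:x}}{\phi(\hat\alpha_{3:x}, r, s)\,(1+2\delta_{\bar x})} + \frac{g^3\,m_2^{3/2}\,\hat\alpha_{3:x}^3}{\phi^3(\hat\alpha_{3:x}, r, s)\,(1+2\delta_{\bar x})^3} - \frac{g\,m_2^{3/2}\,\hat\alpha_{3:x}(1215+6417\hat\gamma+7938\hat\gamma^2+2520\hat\gamma^3)}{\phi^3(\hat\alpha_{3:x}, r, s)\,(15+63\hat\gamma+30\hat\gamma^2)(1+2\delta_{\bar w})},
\]

in which $m_3 = m_2^{3/2}\,\alpha_{3:x}$, the unaccented $\alpha_{3:x}$ now denoting the skewness of the observed sample.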



Solving for $\alpha_{3:x}$, we have

\[
\alpha_{3:x} = \frac{\hat\alpha_{3:x}}{\phi^3(\hat\alpha_{3:x}, r, s)}\left\{1 - \frac{3g\,\phi^2(\hat\alpha_{3:x}, r, s)}{1+2\delta_{\bar x}} - \frac{2\hat\gamma\,g^3}{(1+2\delta_{\bar x})^3} + \frac{g(1215+6417\hat\gamma+7938\hat\gamma^2+2520\hat\gamma^3)}{(1+2\delta_{\bar w})(15+63\hat\gamma+30\hat\gamma^2)}\right\}. \tag{79}
\]

Since the right member of (79) is a function of $\hat\alpha_{3:x}$, $r$, and $s$, the
most probable value of $\alpha_{3:x}$ may be approximated when we are given $s$, $r$, and
the skewness of an observed sample. As it is an algebraic equation of high
order in $\hat\alpha_{3:x}$ and is so much involved, even approximation presents practical
difficulty. However, once $\hat\alpha_{3:x}$ is approximated, $\hat\sigma_x$ and $\hat M_x$ can be easily
obtained from (60) and (68).

104 CHING-LAI SHEN

Theorem XXI. For the best approximated ‘most probable value’ of the
skewness of a parent population which is assumed to be distributed according to
Type III, we must approximate it from equation (79), in which the skewness of
an observed sample is expressed as a function of $s$, $r$, and the best approximated
‘most probable value’ of the skewness of the parent population.

To construct a table for the best approximated ‘most probable value’ $\hat\alpha_{3:x}$
corresponding to $\alpha_{3:x}$ for particular values of $r$, $s$, we should first reverse the
process by assigning different values of $\hat\alpha_{3:x}$ so as to obtain $\alpha_{3:x}$; then by
way of interpolation, we shall be able to obtain $\hat\alpha_{3:x}$ for a particular $\alpha_{3:x}$.


TABLE VII

Relation of the Sample Skewness and the Best Approximated ‘Most Probable Value’
of the Parent Population Whose Distribution is According to Type III

($s \to \infty$, $r = 100$)

    $\alpha_{3:x}$     $\hat\alpha_{3:x}$
       .1         .0784
       .2         .1568
       .3         .2373
       .4         .3164
       .5         .3969
       .6         .4776
       .7         [illegible]
       .8         .6410
       .9         .7239
      1.0         .8072
      1.1         .8905
      1.2         .9737
      1.3        1.0567
      1.4        1.1392
      1.5        1.2211
      1.6        1.3022
      1.7        1.3791
      1.8        1.4578
      1.9        [illegible]
      2.0        1.6122
      2.1        1.6828
      2.2        [illegible]
      2.3        1.8303
      2.4        1.9024
      2.5        1.9670

For $s \to \infty$ and $r = 100$, we have computed the best approximated ‘most
probable value’ of $\hat\alpha_{3:x}$ corresponding to the values of $\alpha_{3:x}$ from .1 to 2.6 as shown
in Table VII.

The computation for such a table is laborious because it involves the
computation of $\delta_{\bar x}$, $\delta_{\bar y}$, and $\delta_{\bar w}$, which are in turn functions of $\alpha_{3:\bar x}$ and $\alpha_{4:\bar x}$, $\alpha_{3:\bar y}$ and
$\alpha_{4:\bar y}$, and $\alpha_{3:\bar w}$ and $\alpha_{4:\bar w}$, respectively.





Section IV. DISTRIBUTION OF THE HYPOTHETICAL MEANS OF THE PARENT
POPULATION

Since we have obtained in the preceding sections expressions for the best 
approximated ‘most probable values’ of the mean, the standard deviation and 
the skewness of a parent population which is assumed to be distributed according 
to Type III, we are now in the position to characterize the distribution of the 
hypothetical means of the parent population with the assumption that the best 
approximated ‘most probable values’ of the mean, the standard deviation, and 
the skewness be the mean, the standard deviation, and the skewness of the 
parent population. 

Basing upon the fundamental relations in (15), we write down the
characteristics of the distribution of the hypothetical means of the parent population as
follows:

\[
\begin{aligned}
M_{M_x} &= m,\\
\sigma_{M_x} &= \sigma_{\bar x} = \hat\sigma_x\sqrt{\frac{s-r}{r(s-1)}} = \sqrt{\frac{s-r}{r(s-1)}}\cdot\frac{\sqrt{m_2}}{\phi(\hat\alpha_{3:x}, r, s)},\\
\alpha_{3:M_x} &= -\alpha_{3:\bar x} = -\frac{s-2r}{s-2}\sqrt{\frac{s-1}{r(s-r)}}\;\hat\alpha_{3:x},\\
\alpha_{4:M_x}-3 &= \alpha_{4:\bar x}-3 = \frac{(s-1)(s^2+s-6rs+6r^2)}{r(s-r)(s-2)(s-3)}\left[\frac{3}{2}\,\hat\alpha_{3:x}^2\right] - \frac{6s(r-1)(s-r-1)}{r(s-r)(s-2)(s-3)},
\end{aligned}
\tag{80}
\]

where $\phi(\hat\alpha_{3:x}, r, s)$ is given in (69).

For an infinite parent population by allowing $s \to \infty$, we obtain from the
above:

\[
\begin{aligned}
M_{M_x} &= m,\\
\sigma_{M_x} &= \frac{1}{\sqrt r}\cdot\frac{\sqrt{m_2}}{\phi(\hat\alpha_{3:x}, r)},\\
\alpha_{3:M_x} &= -\frac{1}{\sqrt r}\,\hat\alpha_{3:x},\\
\alpha_{4:M_x}-3 &= \frac{3}{2r}\,\hat\alpha_{3:x}^2,
\end{aligned}
\tag{81}
\]

where $\phi(\hat\alpha_{3:x}, r) = \lim_{s\to\infty}\phi(\hat\alpha_{3:x}, r, s)$.

Since we observe that the moments of the distribution of the hypothetical
means are expressed in terms of $\hat\alpha_{3:x}$, it is therefore necessary for us to find the
best approximated ‘most probable value’ of the skewness of a parent population
before we attempt to obtain the frequency function associated with the
distribution of these hypothetical means.

Numerical illustration. A sample of 100 weights of freshman students is 
observed and the frequency distribution is given in Table VIII. 


TABLE VIII 
Weights of 100 Freshman Students 
(Original Measurements Correct to Nearest Pound) 





Class Mark Frequency 








109. 
119. 
129. 
139. 
149. 
159. 
169. 
179. 
189. 





The first four moments are computed:

\[
m = 138.3, \qquad \sigma_x = 14.6366, \qquad \alpha_{3:x} = .81099, \qquad \alpha_{4:x} = 4.47644.
\]

Now, assuming this sample is drawn from an infinite parent population which
is assumed to be distributed according to Type III, we wish to find (a) the best
approximated ‘most probable values’ of the mean, the standard deviation, and
the skewness of the parent population, and (b) the probability that the mean of
the parent population lies between $M_x = 135$ and $M_x = 140$.

By interpolation from Table VII, we obtain the best approximated ‘most
probable value’ of the skewness of the parent population:

\[
\hat\alpha_{3:x} = .6501
\]
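The interpolation step can be reproduced with a two-point linear formula. This is a sketch under an explicit assumption: the neighbouring entries of Table VII are taken as .6410 at a sample skewness of .8 and .7239 at .9, the second being a reading of a damaged entry. The helper name `interpolate` is hypothetical.

```python
def interpolate(x, x0, y0, x1, y1):
    """Linear interpolation between the points (x0, y0) and (x1, y1)."""
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)


# Sample skewness .81099 lies between the tabulated arguments .8 and .9.
alpha3_hat = interpolate(0.81099, 0.8, 0.6410, 0.9, 0.7239)
print(round(alpha3_hat, 4))
```

That the result rounds to the value .6501 quoted in the text supports the assumed reading of the two table entries.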
From (69) and (61) we obtain

\[
\hat\sigma_x = 14.5452, \qquad \hat M_x = 138.25272, \qquad \phi(\hat\alpha_{3:x}, r) = 1.006279.
\]

From (81) we have

\[
M_{M_x} = 138.3, \qquad \sigma_{M_x} = 1.45452, \qquad \alpha_{3:M_x} = -.06501, \qquad \alpha_{4:M_x} = 3.00633945.
\]

Since $\delta_{M_x} = 0$, the distribution of $M_x$ is associated with Type III Function; hence
for the probability that $M_x$ lies between $M_x = 135$ and $M_x = 140$, we again
refer to Tables of Pearson’s Type III Function prepared by L. R. Salvosa,¹⁹ and
we obtain in this case

\[
P = .8677592
\]

Since the determination of the best fit of a frequency curve in general depends
upon the values of $\alpha_3$, $\alpha_4$, and $k$, and since in the present case each of them is a
function of $s$, $r$, and $\hat\alpha_{3:x}$, we are therefore not able to tell the type of curve to
be used until we know $s$, $r$, and $\hat\alpha_{3:x}$.

For the infinite case, however, as we have illustrated, Type III Function may
always be used because

\[
\delta_{M_x} = \frac{2\alpha_{4:\bar x} - 3\alpha_{3:\bar x}^2 - 6}{\alpha_{4:\bar x} + 3} = \frac{2\alpha_{4:M_x} - 3\alpha_{3:M_x}^2 - 6}{\alpha_{4:M_x} + 3} = 0
\]

holds for all values of $\hat\alpha_{3:x}$ and $r$. We therefore conclude that the hypothetical
means of an infinite parent population which is itself distributed according to
Type III are distributed according to Type III. Hence

Theorem XXII. The hypothetical means of an infinite parent population are
distributed according to Type III if the parent population is assumed to be
distributed according to Type III.
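The statement that the criterion vanishes identically in the infinite case is easy to confirm numerically. The sketch below is illustrative (the function name `hypothetical_mean_moments` is an assumption). It takes the infinite-population characteristics as $\sigma = \hat\sigma_x/\sqrt r$, $\alpha_3 = -\hat\alpha_{3:x}/\sqrt r$ and $\alpha_4 = 3 + 3\hat\alpha_{3:x}^2/(2r)$, and evaluates $\delta = (2\alpha_4 - 3\alpha_3^2 - 6)/(\alpha_4 + 3)$.

```python
from math import sqrt


def hypothetical_mean_moments(m, sigma_hat, alpha3_hat, r):
    """Mean, s.d., alpha_3, alpha_4 and delta of the hypothetical means
    for an infinite Type III parent population."""
    sd = sigma_hat / sqrt(r)
    a3 = -alpha3_hat / sqrt(r)
    a4 = 3 + 3 * alpha3_hat ** 2 / (2 * r)
    delta = (2 * a4 - 3 * a3 ** 2 - 6) / (a4 + 3)
    return m, sd, a3, a4, delta


# Values of the numerical illustration: m = 138.3, sigma-hat = 14.5452,
# alpha3-hat = .6501, r = 100.
mean, sd, a3, a4, delta = hypothetical_mean_moments(138.3, 14.5452, 0.6501, 100)
print(sd, abs(a3), a4)
```

Because $2\alpha_4 - 6 = 3\hat\alpha_{3:x}^2/r = 3\alpha_3^2$, the numerator of $\delta$ cancels for every $\hat\alpha_{3:x}$ and $r$, which is the content of Theorem XXII.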





Section V. DISTRIBUTION OF THE HYPOTHETICAL VARIANCES OF THE
PARENT POPULATION


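Parallel to Part III, Section V, the distribution of the hypothetical variances
of a parent population which is assumed to be distributed according to Type III
can be described. The fundamental relations of Theorems II and III hold:

\[
\begin{aligned}
\mu_{2n:p} &= \mu_{2n:\bar y}, & \mu_{2n+1:p} &= -\mu_{2n+1:\bar y},\\
\text{or}\qquad \alpha_{2n:p} &= \alpha_{2n:\bar y}, & \alpha_{2n+1:p} &= -\alpha_{2n+1:\bar y}.
\end{aligned}
\]

But now $M_y = \dfrac{\sum (x-\hat M_x)^2}{r}$ (see Part IV, Section II), so that

\[
\begin{aligned}
M_y &= \frac{1}{r}\sum\left(x - m + g\,\frac{\hat\sigma_x\,\hat\alpha_{3:x}}{1+2\delta_{\bar x}}\right)^2 \qquad [\text{from (60)}]\\
&= m_2 + g^2\,\frac{m_2\,\hat\alpha_{3:x}^2}{(1+2\delta_{\bar x})^2\,\phi^2(\hat\alpha_{3:x}, r, s)}.
\end{aligned}
\tag{82}
\]

19 Salvosa, L. R., Annals of Mathematical Statistics, Vol. I, No. II, 1930.

Upon the same assumption that the best approximated ‘most probable values’
of the mean, the standard deviation and the skewness be the mean, the standard
deviation, and the skewness of the parent population, the distribution of $\hat\mu_{2:x}$ is
characterized by

\[
\begin{aligned}
M_{\hat\mu_{2:x}} &= m_2 + g^2\,\frac{m_2\,\hat\alpha_{3:x}^2}{(1+2\delta_{\bar x})^2\phi^2(\hat\alpha_{3:x}, r, s)} = m_2\left\{1 + \frac{g^2\,\hat\alpha_{3:x}^2}{(1+2\delta_{\bar x})^2\phi^2(\hat\alpha_{3:x}, r, s)}\right\},\\
\sigma_{\hat\mu_{2:x}} &= \sigma_{\bar y} = \hat\sigma_x^2\sqrt{\frac{s-r}{r(s-1)}\,(3\hat\gamma+2)} = \sqrt{\frac{s-r}{r(s-1)}\,(3\hat\gamma+2)}\cdot\frac{m_2}{\phi^2(\hat\alpha_{3:x}, r, s)},\\
\alpha_{3:\hat\mu_{2:x}} &= -\alpha_{3:\bar y} = -\frac{s-2r}{s-2}\sqrt{\frac{s-1}{r(s-r)}}\cdot\frac{30\hat\gamma^2+56\hat\gamma+8}{(3\hat\gamma+2)^{3/2}},\\
\alpha_{4:\hat\mu_{2:x}}-3 &= \alpha_{4:\bar y}-3\\
&= \frac{(s-1)(s^2+s-6rs+6r^2)}{r(s-r)(s-2)(s-3)}\cdot\frac{630\hat\gamma^3+1680\hat\gamma^2+912\hat\gamma+48}{(3\hat\gamma+2)^2} - \frac{6s(r-1)(s-r-1)}{r(s-r)(s-2)(s-3)}.
\end{aligned}
\tag{83}
\]

For an infinite parent population, we have

\[
\begin{aligned}
M_{\hat\mu_{2:x}} &= m_2\left\{1 + \frac{1}{4r^2}\cdot\frac{\hat\alpha_{3:x}^2}{\phi^2(\hat\alpha_{3:x}, r)}\right\},\\
\sigma_{\hat\mu_{2:x}} &= \sqrt{\frac{3\hat\gamma+2}{r}}\cdot\frac{m_2}{\phi^2(\hat\alpha_{3:x}, r)},\\
\alpha_{3:\hat\mu_{2:x}} &= -\frac{1}{\sqrt r}\cdot\frac{30\hat\gamma^2+56\hat\gamma+8}{(3\hat\gamma+2)^{3/2}},\\
\alpha_{4:\hat\mu_{2:x}}-3 &= \frac{1}{r}\cdot\frac{630\hat\gamma^3+1680\hat\gamma^2+912\hat\gamma+48}{(3\hat\gamma+2)^2}.
\end{aligned}
\tag{84}
\]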
Numerical illustration. Using the same sample in Table VIII, we wish to
ascertain the probability that the variance of the parent population lies between
306.25 and 342.25. From $m = 138.3$, $\sigma_x = 14.6366$, $\alpha_{3:x} = .81099$, and $\alpha_{4:x} = 4.47644$,
we find from (84)

\[
\begin{aligned}
M_{\hat\mu_{2:x}} &= 214.232235\\
\sigma_{\hat\mu_{2:x}} &= 34.33574\\
\alpha_{3:\hat\mu_{2:x}} &= -.495311\\
\alpha_{4:\hat\mu_{2:x}} &= 3.4636757\\
\delta_{\hat\mu_{2:x}} &= .1055156
\end{aligned}
\]

From Part I, Section III,

\[
k = \frac{\alpha_{3:\hat\mu_{2:x}}^2}{4\,\delta_{\hat\mu_{2:x}}(2+\delta_{\hat\mu_{2:x}})} = .276 < 1.
\]

Therefore, the best fitting curve will be Type IV, which assumes the form²⁰

\[
y = y_0\,(1+x^2)^{-m}\,e^{-\lambda\tan^{-1}x},
\]


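where

\[
x = \frac{t+p}{q}, \qquad t \text{ being in standard units},
\]

\[
p = \frac{\alpha_3}{2\delta}, \qquad
q^2 = \frac{4b_0b_2 - b_1^2}{4b_2^2} = \frac{4\delta(2+\delta) - \alpha_3^2}{4\delta^2}.
\]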


” 1 
1 g(2m —2,r) Fm —2,r) 


$y_0$ is found from Pearson’s Tables for Statisticians and Biometricians²¹ to be
.049662.

20 Elderton, W. P., op. cit., p. 64.
21 Pearson, K., Tables for Statisticians and Biometricians, Vol. 1, pp. 126-142.

Now the given limits 306.25 and 342.25 of the variance, when expressed in
standard units, are

\[
t_1 = 2.679941, \qquad t_2 = 3.728410.
\]

Therefore the probability that $\hat\mu_{2:x}$ lies between $\hat\mu_{2:x} = 306.25$ and $\hat\mu_{2:x} = 342.25$ is

\[
P = \int_{t=2.679941}^{t=3.728410} y_0\,(1+x^2)^{-m}\,e^{-\lambda\tan^{-1}x}\,dx.
\]

We find

\[
m = 11.477271, \qquad \lambda = 12.940307,
\]

\[
P = .049662\int_{.08757}^{.36343} (1+x^2)^{-11.477271}\,e^{-12.940307\tan^{-1}x}\,dx.
\]

By means of Maclaurin-Euler’s Interpolation Formula, $P$ is found to be equal
to .000904.
No definite law can be ascertained before we know $\hat\alpha_{3:x}$ because, as we have
seen, $\alpha_{3:\hat\mu_{2:x}}$ and $\alpha_{4:\hat\mu_{2:x}}$ are both expressed in terms of $s$, $r$, and $\hat\alpha_{3:x}$. We do
not know the value of $k$, which is a determining factor of the best fitting curve
and a function of $s$, $r$, $\alpha_{3:\hat\mu_{2:x}}$, and $\alpha_{4:\hat\mu_{2:x}}$, until we know the values of $s$, $r$,
and $\hat\alpha_{3:x}$.









Section VI. DISTRIBUTION OF THE HYPOTHETICAL THIRD MOMENTS OF THE
PARENT POPULATION ABOUT ITS MEAN


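Recalling the fact that the distribution of the third moments of sample means
about the most probable value of the mean of the parent population is equivalent
to the consideration of a distribution of sample means drawn from a parent
population $w_1, w_2, w_3, \cdots, w_s$, where $w_i = (x_i - \hat M_x)^3$, we can write down in
accordance with the fundamental relations stated in Theorems II and III:

\[
\begin{aligned}
\mu_{2n:p} &= \mu_{2n:\bar w}, & \mu_{2n+1:p} &= -\mu_{2n+1:\bar w},\\
\text{or}\qquad \alpha_{2n:p} &= \alpha_{2n:\bar w}, & \alpha_{2n+1:p} &= -\alpha_{2n+1:\bar w}.
\end{aligned}
\]

But here $M_w = \dfrac{\sum (x-\hat M_x)^3}{r}$, and by the substitution of (60), we have

\[
\begin{aligned}
M_w &= \frac{1}{r}\sum\left(x - m + g\,\frac{\hat\sigma_x\,\hat\alpha_{3:x}}{1+2\delta_{\bar x}}\right)^3\\
&= m_3 + 3m_2\,g\,\frac{m_2^{1/2}\,\hat\alpha_{3:x}}{(1+2\delta_{\bar x})\,\phi(\hat\alpha_{3:x}, r, s)} + g^3\,\frac{m_2^{3/2}\,\hat\alpha_{3:x}^3}{(1+2\delta_{\bar x})^3\,\phi^3(\hat\alpha_{3:x}, r, s)}.
\end{aligned}
\tag{86}
\]

Consequently, with the same assumption that the best approximated ‘most
probable values’ of the mean, the standard deviation, and the skewness be the
mean, the standard deviation and the skewness of the parent population, the
distribution of $\hat\mu_{3:x}$ is characterized by

\[
\begin{aligned}
M_{\hat\mu_{3:x}} &= m_3 + 3m_2\,g\,\frac{m_2^{1/2}\,\hat\alpha_{3:x}}{(1+2\delta_{\bar x})\,\phi(\hat\alpha_{3:x}, r, s)} + g^3\,\frac{m_2^{3/2}\,\hat\alpha_{3:x}^3}{(1+2\delta_{\bar x})^3\,\phi^3(\hat\alpha_{3:x}, r, s)},\\
\sigma_{\hat\mu_{3:x}} &= \sigma_{\bar w} = \sqrt{\frac{s-r}{r(s-1)}\,(15+63\hat\gamma+30\hat\gamma^2)}\cdot\frac{m_2^{3/2}}{\phi^3(\hat\alpha_{3:x}, r, s)},\\
\alpha_{3:\hat\mu_{3:x}} &= -\alpha_{3:\bar w} = -\frac{s-2r}{s-2}\sqrt{\frac{s-1}{r(s-r)}}\cdot\frac{\hat\alpha_{3:x}(1215+6417\hat\gamma+7938\hat\gamma^2+2520\hat\gamma^3)}{(15+63\hat\gamma+30\hat\gamma^2)^{3/2}},\\
\alpha_{4:\hat\mu_{3:x}}-3 &= \alpha_{4:\bar w}-3\\
&= \frac{(s-1)(s^2+s-6rs+6r^2)}{r(s-r)(s-2)(s-3)}\cdot\frac{9720+417555\hat\gamma+2707992\hat\gamma^2+5840343\hat\gamma^3+4789530\hat\gamma^4+1247400\hat\gamma^5}{(15+63\hat\gamma+30\hat\gamma^2)^2}\\
&\qquad - \frac{6s(r-1)(s-r-1)}{r(s-r)(s-2)(s-3)}.
\end{aligned}
\tag{87}
\]

For an infinite parent population, we have

\[
\begin{aligned}
M_{\hat\mu_{3:x}} &= m_3 + 3m_2\cdot\frac{m_2^{1/2}\,\hat\alpha_{3:x}}{2r\,\phi(\hat\alpha_{3:x}, r)} + \frac{1}{8r^3}\cdot\frac{m_2^{3/2}\,\hat\alpha_{3:x}^3}{\phi^3(\hat\alpha_{3:x}, r)},\\
\sigma_{\hat\mu_{3:x}} &= \sqrt{\frac{1}{r}\,(15+63\hat\gamma+30\hat\gamma^2)}\cdot\frac{m_2^{3/2}}{\phi^3(\hat\alpha_{3:x}, r)},\\
\alpha_{3:\hat\mu_{3:x}} &= -\frac{1}{\sqrt r}\cdot\frac{\hat\alpha_{3:x}(1215+6417\hat\gamma+7938\hat\gamma^2+2520\hat\gamma^3)}{(15+63\hat\gamma+30\hat\gamma^2)^{3/2}},\\
\alpha_{4:\hat\mu_{3:x}}-3 &= \frac{1}{r}\cdot\frac{9720+417555\hat\gamma+2707992\hat\gamma^2+5840343\hat\gamma^3+4789530\hat\gamma^4+1247400\hat\gamma^5}{(15+63\hat\gamma+30\hat\gamma^2)^2}.
\end{aligned}
\tag{88}
\]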
Numerical illustration. Using the same sample in Table VIII, we wish to
ascertain the probability that the third moment of the parent population about
the mean lies between $\hat\mu_{3:x} = 3000$ and $\hat\mu_{3:x} = 4000$, still assuming an infinite
parent population from which the sample is drawn.

\[
\hat\alpha_{3:x} = .6501, \qquad \phi(\hat\alpha_{3:x}, r) = 1.006279
\]

We find from (88)

\[
\begin{aligned}
M_{\hat\mu_{3:x}} &= 2558.137096\\
\sigma_{\hat\mu_{3:x}} &= 1675.69637\\
\alpha_{3:\hat\mu_{3:x}} &= -1.1874099\\
\alpha_{4:\hat\mu_{3:x}} &= 6.1275516\\
\delta_{\hat\mu_{3:x}} &= 0.221886\\
k &= 0.714972 < 1
\end{aligned}
\]

Therefore the best fitting curve is Type IV.
From Pearson’s Tables for Statisticians and Biometricians, Vol. 1,²² we compute

\[
y_0 = .0000580323
\]

The given limits 3000 and 4000 when expressed in standard units are $t = .263689$
and $t = .860455$ respectively. Therefore the probability that $\hat\mu_{3:x}$
lies between 3000 and 4000 may be expressed by

\[
P = y_0\int_{t=.263689}^{t=.860455} (1+x^2)^{-6.506819}\,e^{-17.440447\tan^{-1}x}\,dx.
\]

By means of Maclaurin-Euler’s Interpolation Formula, the answer is found to
be .267408631.
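The criterion values quoted in this illustration follow directly from the printed $\alpha_3$ and $\alpha_4$. A minimal sketch, with hypothetical function names, taking $\delta = (2\alpha_4 - 3\alpha_3^2 - 6)/(\alpha_4 + 3)$ and $k = \alpha_3^2/\{4\delta(2+\delta)\}$, and treating $0 < k < 1$ as the Type IV condition used in the text:

```python
def pearson_delta(alpha3, alpha4):
    """Criterion delta from the third and fourth standardized moments."""
    return (2 * alpha4 - 3 * alpha3 ** 2 - 6) / (alpha4 + 3)


def pearson_k(alpha3, delta):
    """Criterion k; a value between 0 and 1 points to a Type IV curve."""
    return alpha3 ** 2 / (4 * delta * (2 + delta))


alpha3, alpha4 = -1.1874099, 6.1275516
delta = pearson_delta(alpha3, alpha4)
k = pearson_k(alpha3, delta)
is_type_iv = 0 < k < 1

# Limits 3000 and 4000 in standard units, from the printed mean and s.d.
t_low = (3000 - 2558.137096) / 1675.69637
t_high = (4000 - 2558.137096) / 1675.69637
print(round(delta, 6), is_type_iv, round(t_low, 6))
```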
We make the same remark here as we have made in the preceding two sections.
That is, since $\alpha_{3:\hat\mu_{3:x}}$ and $\alpha_{4:\hat\mu_{3:x}}$ are both in terms of $s$, $r$ and $\hat\alpha_{3:x}$, we cannot
determine the value of $k$, which is a function of $\alpha_{3:\hat\mu_{3:x}}$ and $\alpha_{4:\hat\mu_{3:x}}$, until we know
the values of $s$, $r$, and $\hat\alpha_{3:x}$. Consequently, the curve associated with the
distribution of the hypothetical third moments of a parent population of Type III
distribution is not known until we know $s$, $r$, and $\hat\alpha_{3:x}$.


22 Pearson, K., op. cit., pp. 126-142.